We consider the problem of dualizing a monotone CNF (equivalently, computing all minimal transversals of a hypergraph), whose associated decision problem is a prominent open problem in NP-completeness. We present a number of new polynomial time resp. output-polynomial time results for significant cases, which largely advance the tractability frontier and improve on previous results. Furthermore, we show that duality of two monotone CNFs can be disproved with limited nondeterminism. More precisely, this is feasible in polynomial time with $O(\chi(n) \cdot \log n)$ suitably guessed bits, where $\chi(n)$ is given by $\chi(n)^{\chi(n)} = n$; note that $\chi(n) = o(\log n)$. This result sheds new light on the complexity of this important problem.

1 Introduction

Recall that the prime CNF of a monotone Boolean function $f$ is the unique formula $\varphi = \bigwedge_{c \in S} c$ in conjunctive normal form where $S$ is the set of all prime implicates of $f$, i.e., minimal clauses $c$ which are logical consequences of $f$. In this paper, we consider the following problem:





Problem DUALIZATION

Input: The prime CNF $\varphi$ of a monotone Boolean function $f = f(x_1, \ldots, x_m)$.

Output: The prime CNF $\psi$ of its dual $f^d = \bar{f}(\bar{x}_1, \ldots, \bar{x}_m)$.





It is well known that DUALIZATION is equivalent to the TRANSVERSAL COMPUTATION problem, which requests to compute the set of all minimal transversals (i.e., minimal hitting sets) of a given hypergraph $\mathcal{H}$, in other words, the transversal hypergraph $Tr(\mathcal{H})$ of $\mathcal{H}$. Actually, these problems can be viewed as the same problem, if the clauses in a monotone CNF $\varphi$ are identified with the sets of variables they contain. DUALIZATION is a search problem; the associated decision problem Dual is to decide whether two given monotone prime CNFs $\varphi$ and $\psi$ represent a pair $(f, g)$ of dual Boolean functions. Analogously, the decision problem Trans-Hyp associated with TRANSVERSAL COMPUTATION is deciding, given hypergraphs $\mathcal{H}$ and $\mathcal{G}$, whether $\mathcal{G} = Tr(\mathcal{H})$.
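To make the two problems concrete, the following is a minimal brute-force sketch of TRANSVERSAL COMPUTATION and of the Trans-Hyp test. It is exponential in the number of vertices and is meant only as an illustration; the function names and the representation of hyperedges as Python sets are assumptions of this sketch, not notation from the paper.

```python
from itertools import combinations

def minimal_transversals(hyperedges, vertices):
    """Brute-force Tr(H): all inclusion-minimal vertex sets that hit every edge."""
    edges = [frozenset(e) for e in hyperedges]
    found = []
    # Candidates are enumerated by increasing size, so a candidate is minimal
    # exactly when no previously found transversal is contained in it.
    for size in range(len(vertices) + 1):
        for cand in combinations(sorted(vertices), size):
            s = frozenset(cand)
            if all(s & e for e in edges) and not any(t <= s for t in found):
                found.append(s)
    return set(found)

def is_trans_hyp(g_edges, h_edges, vertices):
    """Decide whether G = Tr(H) by recomputing Tr(H) from scratch."""
    return {frozenset(e) for e in g_edges} == minimal_transversals(h_edges, vertices)

# Hypergraph with edges {1,2} and {1,3}: the minimal hitting sets are {1} and {2,3}.
example = minimal_transversals([{1, 2}, {1, 3}], {1, 2, 3})
```

Identifying clauses with their variable sets, the same call dualizes the monotone CNF $(x_1 \vee x_2)(x_1 \vee x_3)$.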

DUALIZATION and several problems which, like TRANSVERSAL COMPUTATION, are known to be computationally equivalent to problem DUALIZATION (see [15]) are of interest in various areas such as database theory (e.g., [38]), machine learning and data mining, game theory, artificial intelligence (e.g., [21, 28, 29]), mathematical programming, and distributed systems (e.g., [18, 27]), to mention a few.

While the output CNF $\psi$ can be exponential in the size of $\varphi$, it is currently not known whether $\psi$ can be computed in output-polynomial (or polynomial total) time, i.e., in time polynomial in the combined size of $\varphi$ and $\psi$. Any such algorithm for DUALIZATION (or for TRANSVERSAL COMPUTATION) would significantly advance the state of the art of several problems in the above application areas. Similarly, the complexity of Dual (equivalently, Trans-Hyp) has been open for more than 20 years now (cf. [15]).

Note that DUALIZATION is solvable in polynomial total time on a class $\mathcal{C}$ of hypergraphs iff Dual is in PTIME for all pairs $(\mathcal{H}, \mathcal{G})$, where $\mathcal{H} \in \mathcal{C}$. Dual is known to be in co-NP and the best currently known upper time-bound is quasi-polynomial time [17, 47]. Determining the complexities of DUALIZATION and Dual, and of equivalent problems such as the transversal problems, is a prominent open problem. This is witnessed by the fact that these problems are cited in a rapidly growing body of literature and have been referenced in various survey papers and complexity theory retrospectives.

Given the importance of monotone dualization and equivalent problems for many application areas, 
and given the long standing failure to settle the complexity of these problems, emphasis was put on 
finding tractable cases of DUAL and corresponding polynomial total-time cases of DUALIZATION. In 
fact, several relevant tractable classes were found by various authors; see e.g. [20, 36, 39] and references therein. Moreover, classes of formulas were identified on which
DUALIZATION is not just polynomial total-time, but where the conjuncts of the dual formula can be 
enumerated with incremental polynomial delay, i.e., with delay polynomial in the size of the input plus 
the size of all conjuncts so far computed, or even with polynomial delay, i.e., with delay polynomial 
in the input size only. On the other hand, there are also results which show that certain well-known 
algorithms for DUALIZATION are not polynomial total time. For example, [15, 39] pointed out that a well-known sequential algorithm, in which the clauses $c_i$ of a CNF $\varphi = c_1 \wedge \cdots \wedge c_m$ are processed in order $i = 1, \ldots, m$, is not polynomial total time in general. Most recently, [46] showed that this holds even if an optimal ordering of the clauses is assumed (i.e., they may be arbitrarily arranged for free).
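The sequential algorithm mentioned above can be sketched as follows: it keeps the prime DNF of $c_1 \wedge \cdots \wedge c_i$ and multiplies in the next clause. This is a plain illustrative version under the assumption that clauses and terms are represented as sets of variable indices; the function names are hypothetical.

```python
def minimal_sets(sets):
    """Keep only the inclusion-minimal sets of a family."""
    sets = set(sets)
    return {s for s in sets if not any(t < s for t in sets)}

def sequential_dualize(clauses):
    """Process c_1, ..., c_m in order, maintaining the prime DNF of c_1 ∧ ... ∧ c_i."""
    terms = {frozenset()}  # prime DNF of the empty conjunction (constant 1)
    for clause in clauses:
        # Distribute the new clause over the current terms, then drop
        # every term that is absorbed by a smaller one.
        terms = minimal_sets(t | {x} for t in terms for x in clause)
    return terms

# (x1 ∨ x2)(x1 ∨ x3)(x2 ∨ x3 ∨ x4)(x1 ∨ x4)
prime_implicants = sequential_dualize([{1, 2}, {1, 3}, {2, 3, 4}, {1, 4}])
```

The intermediate families `terms` are exactly what can blow up: they may become much larger than the final output, which is why this scheme is not polynomial total time in general.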

Main Goal. The main goal of this paper is to present important new polynomial total time cases of 
DUALIZATION and, correspondingly, PTIME solvable subclasses of DUAL which significantly improve 
previously considered classes. Towards this aim, we first present a new algorithm Dualize and prove its 
correctness. DUALIZE can be regarded as a generalization of a related algorithm proposed by Johnson, Yannakakis, and Papadimitriou [31]. Like other dualization algorithms, DUALIZE reduces the original problem by self-reduction to smaller instances. However, the subdivision into subproblems proceeds according to a particular order which is induced by an arbitrary fixed ordering of the variables. This, in turn, allows us to derive some bounds on intermediate computation steps which imply that DUALIZE, when applied to a variety of input classes, outputs the conjuncts of $\psi$ with polynomial delay or incremental polynomial delay. In particular, we show positive results for the following input classes:



• Degenerate CNFs. We generalize the notion of $k$-degenerate graphs [50] to hypergraphs and define $k$-degenerate monotone CNFs resp. hypergraphs. We prove that for any constant $k$, DUALIZE works with polynomial delay on $k$-degenerate CNFs. Moreover, it works in output-polynomial time on $O(\log n)$-degenerate CNFs.

• Read-$k$ CNFs. A CNF is read-$k$, if each variable appears at most $k$ times in it. We show that for read-$k$ CNFs, problem DUALIZATION is solvable with polynomial delay, if $k$ is constant, and in total polynomial time, if $k = O(\log \|\varphi\|)$. Our result for constant $k$ significantly improves upon the previous best known algorithm [12], which has a higher complexity bound, is not polynomial delay, and outputs the clauses of $\psi$ in no specific order. The result for $k = O(\log \|\varphi\|)$ is a non-trivial generalization of the result in [12], which was posed as an open problem [11].



• Acyclic CNFs. There are several notions of hypergraph resp. monotone CNF acyclicity [16], where the most general and well-known is $\alpha$-acyclicity. As shown in [15], DUALIZATION is polynomial total time for $\beta$-acyclic CNFs; $\beta$-acyclicity is the hereditary version of $\alpha$-acyclicity and far less general. A similar result for $\alpha$-acyclic prime CNFs was left open. (For non-prime $\alpha$-acyclic CNFs, this is trivially as hard as the general case.) In this paper, we give a positive answer and show that for $\alpha$-acyclic (prime) $\varphi$, DUALIZATION is solvable with polynomial delay.

• Formulas of Bounded Treewidth. The treewidth [45] of a graph expresses its degree of cyclicity. Treewidth is an extremely general notion, and bounded treewidth generalizes almost all other notions of near-acyclicity. Following [13], we define the treewidth of a hypergraph resp. monotone CNF $\varphi$ as the treewidth of its associated (bipartite) variable-clause incidence graph. We show that DUALIZATION is solvable with polynomial delay (exponential in $k$) if the treewidth of $\varphi$ is bounded by a constant $k$, and in polynomial total time if the treewidth is $O(\log \log \|\varphi\|)$.

• Recursive Applications of DUALIZE and $k$-CNFs. We show that if DUALIZE is applied recursively and the recursion depth is bounded by a constant, then DUALIZATION is solved in polynomial total time. We apply this to provide a simpler proof of the known result that monotone $k$-CNFs (where each conjunct contains at most $k$ variables) can be dualized in output-polynomial time.

After deriving the above results, we turn our attention (in Section 5) to the fundamental computational nature of problems Dual and Trans-Hyp in terms of complexity theory.



Limited nondeterminism. In a landmark paper, Fredman and Khachiyan [17] proved that problem Dual can be solved in quasi-polynomial time. More precisely, they first gave an algorithm A solving the problem in $n^{O(\log^2 n)}$ time, and then a more complicated algorithm B whose runtime is bounded by $n^{4\chi(n)+O(1)}$, where $\chi(n)$ is defined by $\chi(n)^{\chi(n)} = n$. As noted in [17], $\chi(n) \sim \log n / \log \log n = o(\log n)$; therefore, duality checking is feasible in $n^{o(\log n)}$ time. This is the best upper bound for problem Dual so far, and shows that the problem is most likely not NP-complete.
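To get a feeling for how slowly $\chi(n)$ grows, the defining equation $\chi^{\chi} = n$ can be solved numerically. The sketch below uses bisection on $\chi \ln \chi = \ln n$; the function name `chi` and the tolerance are assumptions of this illustration.

```python
import math

def chi(n, tol=1e-12):
    """Solve chi**chi == n for chi >= 1 by bisection on chi * ln(chi) = ln(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    target = math.log(n)
    # x * ln(x) is increasing for x >= 1, equals 0 at x = 1,
    # and is at least `target` at the upper end chosen here.
    lo, hi = 1.0, max(2.0, target + 1.0)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# chi(4) is about 2 and chi(27) is about 3, since 2**2 = 4 and 3**3 = 27.
values = [chi(4), chi(27), chi(256)]
```

Even for $n = 10^{100}$ the value of `chi(n)` stays below 100, which illustrates why $n^{4\chi(n)+O(1)}$ is only mildly super-polynomial.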

A natural question is whether Dual lies in some lower complexity class based on other resources than just runtime. In the present paper, we advance the complexity status of this problem by showing that its complement is feasible with limited nondeterminism, i.e., by a nondeterministic polynomial-time algorithm that makes only a poly-logarithmic number of guesses. For a survey on complexity classes with limited nondeterminism, and for several references, see [23]. We first show by using a simple but effective technique, which succinctly describes computation paths, that testing non-duality is feasible in polynomial time with $O(\log^3 n)$ nondeterministic steps. We then observe that this approach can be improved to obtain a bound of $O(\chi(n) \cdot \log n)$ nondeterministic steps. This result is surprising, because most researchers dealing with the complexity of DUAL and TRANS-HYP believed so far that these problems are completely unrelated to limited nondeterminism.

We believe that the results presented in this paper are significant, and we are confident that they will prove useful in various contexts. First, we hope that the various polynomial/output-polynomial cases of the problems which we identify will lead to better and more general methods in various application areas (as we show, e.g., in learning and data mining [12]), and that based on the algorithm DUALIZE or some future modifications, further relevant tractable classes will be identified. Second, we hope that our discovery on limited nondeterminism provides a new momentum to complexity research on DUAL and Trans-Hyp, and will push it towards settling these longstanding open problems.

The rest of this paper is structured as follows. The next section provides some preliminaries and introduces notation. In Section 3, we present our algorithm DUALIZE for dualizing a given monotone prime CNF. After that, we exploit this algorithm in Section 4 to derive a number of polynomial instance classes of the problems DUALIZATION and DUAL. In Section 5 we then show that Dual can be solved with limited nondeterminism.



2 Preliminaries and Notation 

A Boolean function (in short, function) is a mapping $f : \{0,1\}^n \to \{0,1\}$, where $v \in \{0,1\}^n$ is called a Boolean vector (in short, vector). As usual, we write $g \le f$ if $f$ and $g$ satisfy $g(v) \le f(v)$ for all $v \in \{0,1\}^n$, and $g < f$ if $g \le f$ and $g \ne f$. A function $f$ is monotone (or positive), if $v \le w$ (i.e., $v_i \le w_i$ for all $i$) implies $f(v) \le f(w)$ for all $v, w \in \{0,1\}^n$. Boolean variables $x_1, x_2, \ldots, x_n$ and their complements $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n$ are called literals. A clause (resp., term) is a disjunction (resp., conjunction) of literals containing at most one of $x_i$ and $\bar{x}_i$ for each variable. A clause $c$ (resp., term $t$) is an implicate (resp., implicant) of a function $f$, if $f \le c$ (resp., $t \le f$); moreover, it is prime, if there is no implicate $c' < c$ (resp., no implicant $t' > t$) of $f$, and monotone, if it consists of positive literals only. We denote by $PI(f)$ the set of all prime implicants of $f$.

A conjunctive normal form (CNF) (resp., disjunctive normal form, DNF) is a conjunction of clauses (resp., disjunction of terms); it is prime (resp. monotone), if all its members are prime (resp. monotone). For any CNF (resp., DNF) $\rho$, we denote by $|\rho|$ the number of clauses (resp., terms) in it. Furthermore, for any formula $\varphi$, we denote by $V(\varphi)$ the set of variables that occur in $\varphi$, and by $\|\varphi\|$ its length, i.e., the number of literals in it. We occasionally view CNFs $\varphi$ also as sets of clauses, and clauses as sets of literals, and use respective notation (e.g., $c \in \varphi$, $x_i \in c$, etc.).

As is well known, a function $f$ is monotone iff it has a monotone CNF. Furthermore, all prime implicants and prime implicates of a monotone $f$ are monotone, and it has a unique prime CNF, given by the conjunction of all its prime implicates. For example, the monotone $f$ such that $f(v) = 1$ iff $v \in \{(1100), (1110), (1101), (0111), (1111)\}$ has the unique prime CNF $\varphi = x_2 (x_1 \vee x_3)(x_1 \vee x_4)$.

Recall that the dual of a function $f$, denoted $f^d$, is defined by $f^d(x) = \bar{f}(\bar{x})$, where $\bar{f}$ and $\bar{x}$ are the complements of $f$ and $x$, respectively. By definition, we have $(f^d)^d = f$. From De Morgan's law, we obtain a formula for $f^d$ from any one of $f$ by exchanging $\vee$ and $\wedge$ as well as the constants 0 and 1. For example, if $f$ is given by $\varphi = x_1 x_2 \vee x_1 (x_3 \vee x_4)$, then $f^d$ is represented by $\psi = (x_1 \vee x_2)(x_1 \vee x_3 x_4)$. For a monotone function $f$, let $\psi = \bigwedge_{c \in C} (\bigvee_{x_i \in c} x_i)$ be the prime CNF of $f^d$. Then by De Morgan's law, $f$ has the (unique) prime DNF $\rho = \bigvee_{c \in C} (\bigwedge_{x_i \in c} x_i)$; for the monotone function $f$ with prime CNF $x_2 (x_1 \vee x_3)(x_1 \vee x_4)$ considered above, $\rho = x_1 x_2 \vee x_2 x_3 x_4$. Thus, we will regard DUALIZATION also as the problem of computing the prime DNF of $f$ from the prime CNF of $f$.
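The relationship between the prime CNF of $f$, the prime DNF of $f$, and the prime CNF of $f^d$ can be checked by truth table on the small example with prime CNF $x_2(x_1 \vee x_3)(x_1 \vee x_4)$. The following is an illustrative sketch; the helper names and the dictionary-based assignments are assumptions made here.

```python
from itertools import product

def eval_cnf(clauses, v):
    """Monotone CNF: every clause contains a variable set to 1."""
    return all(any(v[x] for x in c) for c in clauses)

def eval_dnf(terms, v):
    """Monotone DNF: some term has all of its variables set to 1."""
    return any(all(v[x] for x in t) for t in terms)

def dual(f):
    """f^d(x) = not f(not x)."""
    return lambda v: not f({x: 1 - b for x, b in v.items()})

phi = [{2}, {1, 3}, {1, 4}]      # prime CNF of f
rho = [{1, 2}, {2, 3, 4}]        # prime DNF of f
f = lambda v: eval_cnf(phi, v)
f_d = dual(f)

assignments = [dict(zip((1, 2, 3, 4), bits)) for bits in product((0, 1), repeat=4)]
same_function = all(f(v) == eval_dnf(rho, v) for v in assignments)
# Reading the terms of rho as clauses gives a CNF for the dual function.
dual_is_cnf_of_rho = all(f_d(v) == eval_cnf(rho, v) for v in assignments)
involution = all(dual(f_d)(v) == f(v) for v in assignments)
```

All three flags evaluate to `True`, which matches the statement that computing the prime CNF of $f^d$ and computing the prime DNF of $f$ are the same task.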

3 Ordered Transversal Generation 

In what follows, let $f$ be a monotone function and

$$\varphi = \bigwedge_{i=1}^{m} c_i \qquad (1)$$

its prime CNF, where we assume without loss of generality that all variables $x_j$ ($j = 1, 2, \ldots, n$) appear in $\varphi$. Let $\varphi_i$ ($i = 0, 1, \ldots, n$) be the CNF obtained from $\varphi$ by fixing variables $x_j = 1$ for all $j$ with $j \ge i + 1$. By definition, we have $\varphi_0 = 1$ (truth) and $\varphi_n = \varphi$. For example, consider $\varphi = (x_1 \vee x_2)(x_1 \vee x_3)(x_2 \vee x_3 \vee x_4)(x_1 \vee x_4)$. Then we have $\varphi_0 = \varphi_1 = 1$, $\varphi_2 = (x_1 \vee x_2)$, $\varphi_3 = (x_1 \vee x_2)(x_1 \vee x_3)$, and $\varphi_4 = \varphi$. Similarly, for the prime DNF

$$\psi = \bigvee_{t \in PI(f)} t \qquad (2)$$

of $f$, we denote by $\psi_i$ the DNF obtained from $\psi$ by fixing variables $x_j = 1$ for all $j$ with $j \ge i + 1$. Clearly, we have $\varphi_i \equiv \psi_i$, i.e., $\varphi_i$ and $\psi_i$ represent the same function denoted by $f_i$.
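The restrictions to the first $i$ variables are easy to compute on set representations: fixing $x_j = 1$ for $j \ge i+1$ deletes every clause that contains such a variable, and deletes such variables from every term. The sketch below reproduces the example above and checks by truth table that the restricted CNF and DNF agree; the function names are assumptions of this illustration.

```python
from itertools import product

def restrict_cnf(clauses, i):
    """phi_i: a clause containing some x_j with j > i becomes true and disappears."""
    return [c for c in clauses if max(c) <= i]

def restrict_dnf(terms, i):
    """psi_i: variables x_j with j > i are simply dropped from each term."""
    return [frozenset(x for x in t if x <= i) for t in terms]

def equivalent(cnf, dnf, n):
    """Truth-table comparison of a monotone CNF and a monotone DNF on n variables."""
    for bits in product((0, 1), repeat=n):
        on = {j + 1 for j, b in enumerate(bits) if b}
        if all(c & on for c in cnf) != any(t <= on for t in dnf):
            return False
    return True

phi = [frozenset(c) for c in ({1, 2}, {1, 3}, {2, 3, 4}, {1, 4})]
psi = [frozenset(t) for t in ({1, 2}, {1, 3}, {1, 4}, {2, 3, 4})]
agree = [equivalent(restrict_cnf(phi, i), restrict_dnf(psi, i), 4) for i in range(5)]
```

Note that `restrict_dnf` may return a DNF that is no longer prime (for instance, it can contain the empty term), which is harmless for the equivalence check.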

Proposition 3.1 Let $\varphi$ and $\psi$ be any CNF and DNF for $f$, respectively. Then, for all $i \ge 0$,

(a) $\|\varphi_i\| \le \|\varphi\|$ and $|\varphi_i| \le |\varphi|$, and

(b) $\|\psi_i\| \le \|\psi\|$ and $|\psi_i| \le |\psi|$.

Denote by $\Delta^i$ ($i = 1, 2, \ldots, n$) the CNF consisting of all the clauses in $\varphi_i$ but not in $\varphi_{i-1}$. For the above example, we have $\Delta^1 = 1$, $\Delta^2 = (x_1 \vee x_2)$, $\Delta^3 = (x_1 \vee x_3)$, and $\Delta^4 = (x_2 \vee x_3 \vee x_4)(x_1 \vee x_4)$. Note that $\varphi_i = \varphi_{i-1} \wedge \Delta^i$; hence, for all $i = 1, 2, \ldots, n$ we have

$$\psi_i \equiv \psi_{i-1} \wedge \Delta^i \equiv \bigvee_{t \in PI(f_{i-1})} (t \wedge \Delta^i). \qquad (3)$$

Let $\Delta^i[t]$, for $i = 1, \ldots, n$, denote the CNF consisting of all the clauses $c$ such that $c$ contains no literal in $t_{i-1}$ (the term obtained from $t$ by fixing $x_j = 1$ for all $j \ge i$) and $c \vee x_i$ appears in $\Delta^i$. For example, if $t = x_2 x_3 x_4$ and $\Delta^4 = (x_2 \vee x_3 \vee x_4)(x_1 \vee x_4)$, then $\Delta^4[t] = x_1$. It follows from (3) that for all $i = 1, 2, \ldots, n$

$$\psi_i \equiv \bigvee_{t \in PI(f_{i-1})} \big( (t \wedge \Delta^i[t]) \vee (t \wedge x_i) \big). \qquad (4)$$
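On set representations, $\Delta^i$ consists of the clauses whose largest variable is $x_i$, and $\Delta^i[t]$ is obtained from it by discarding the clauses already hit by the variables of $t$ with index below $i$ and removing $x_i$ from the rest. The following sketch, with hypothetical function names, reproduces the running example.

```python
def delta(clauses, i):
    """Delta^i: clauses of phi_i that are not in phi_{i-1}, i.e. with largest variable x_i."""
    return [c for c in clauses if max(c) == i]

def delta_t(clauses, i, t):
    """Delta^i[t]: clauses c with (c ∨ x_i) in Delta^i that share no variable with t_{i-1}."""
    t_prev = {x for x in t if x < i}
    return [c - {i} for c in delta(clauses, i) if not (c & t_prev)]

phi = [frozenset(c) for c in ({1, 2}, {1, 3}, {2, 3, 4}, {1, 4})]
d4_t = delta_t(phi, 4, {2, 3, 4})   # clause (x2 ∨ x3 ∨ x4) is hit by x2, so only x1 remains
```

An empty list stands for the constant 1 (no clause left), which is the case $\Delta^i[t] = 1$ that DUALIZE skips.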



Lemma 3.2 For every term $t \in PI(f_{i-1})$, let $g_{i,t}$ be the function represented by $\Delta^i[t]$. Then $|PI(g_{i,t})| \le |\psi_i| \le |\psi|$.

Proof. Let $V = \{x_1, x_2, \ldots, x_n\}$ and let $s \in PI(g_{i,t})$. Then by (4), $t \wedge s$ is an implicant of $\psi_i$. Hence, some $t^s \in PI(f_i)$ exists such that $t^s \ge t \wedge s$. Note that $V(t) \cap V(\Delta^i[t]) = \emptyset$, i.e., $t$ and $\Delta^i[t]$ have no variable in common, and hence we have $V(s) \subseteq V(t^s)$ ($\subseteq V(s) \cup V(t)$), since otherwise there exists a clause $c$ in $\Delta^i[t]$ such that $V(c) \cap V(t^s) = \emptyset$, a contradiction. Thus $V(t^s) \cap V(\Delta^i[t]) = V(s)$. For any $s' \in PI(g_{i,t})$ such that $s \ne s'$, let $t^s, t^{s'} \in PI(f_i)$ such that $t^s \ge t \wedge s$ and $t^{s'} \ge t \wedge s'$, respectively. By the above discussion, we have $t^s \ne t^{s'}$. This completes the proof. □

We now describe our algorithm DUALIZE for generating $PI(f)$. It is inspired by a similar graph algorithm of Johnson, Yannakakis, and Papadimitriou [31], and can be regarded as a generalization.

Algorithm DUALIZE

Input: The prime CNF $\varphi$ of a monotone function $f$.

Output: The prime DNF $\psi$ of $f$, i.e., all prime implicants of $f$.

Step 1: Compute the smallest prime implicant $t_{min}$ of $f$ and set $Q := \{ t_{min} \}$;

Step 2: while $Q \ne \emptyset$ do begin
    Remove the smallest $t$ from $Q$ and output $t$;
    for each $i$ with $x_i \in V(t)$ and $\Delta^i[t] \ne 1$ do begin
        Compute the prime DNF $\rho_{(t,i)}$ of the function represented by $\Delta^i[t]$;
        for each term $t'$ in $\rho_{(t,i)}$ do begin
            if $t_{i-1} \wedge t'$ is a prime implicant of $f_i$ then begin
                Compute the smallest prime implicant $t^*$ of $f$ such that $t^*_i = t_{i-1} \wedge t'$;
                $Q := Q \cup \{ t^* \}$
            end{if} end{for} end{for}
    end{while}



Here, we say that term $s$ is smaller than term $t$ if $\sum_{x_j \in V(s)} 2^{n-j} < \sum_{x_j \in V(t)} 2^{n-j}$; i.e., as vector, $s$ is lexicographically smaller than $t$.
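For experimentation, the steps of DUALIZE can be transcribed into a short program. The sketch below follows the algorithm's control flow but uses simple, unoptimized subroutines (greedy computation of the smallest prime implicant, clause-by-clause multiplication for the prime DNF of $\Delta^i[t]$, and a binary heap for $Q$), so it does not reflect the running-time bounds discussed in the paper. All function names and the set-based representation are assumptions of this sketch; clauses are assumed to be non-empty sets of variable indices $1, \ldots, n$.

```python
import heapq

def minimal_sets(sets):
    sets = set(sets)
    return {s for s in sets if not any(t < s for t in sets)}

def prime_dnf(clauses):
    """Prime DNF of a monotone CNF by clause-by-clause multiplication."""
    terms = {frozenset()}
    for c in clauses:
        terms = minimal_sets(t | {x} for t in terms for x in c)
    return terms

def dualize(phi, n):
    """Yield the prime implicants of the monotone prime CNF phi in increasing order."""
    phi = [frozenset(c) for c in phi]

    def weight(t):  # ordering from the text: sum of 2^(n-j) over x_j in V(t)
        return sum(2 ** (n - j) for j in t)

    def hits(s, clauses):
        return all(s & c for c in clauses)

    def smallest_extension(u, i):
        # Smallest prime implicant t* of f with t*_i = u: start with all of
        # x_{i+1}, ..., x_n present and greedily drop them in index order.
        t = set(u) | set(range(i + 1, n + 1))
        for j in range(i + 1, n + 1):
            if hits(t - {j}, phi):
                t.discard(j)
        return frozenset(t)

    def is_prime_implicant(u, i):
        # u hits every clause of phi_i, and each variable of u is needed for some clause.
        phi_i = [c for c in phi if max(c) <= i]
        return hits(u, phi_i) and all(any(c & u == {x} for c in phi_i) for x in u)

    t_min = smallest_extension(frozenset(), 0)            # Step 1
    queue, seen = [(weight(t_min), t_min)], {t_min}
    while queue:                                          # Step 2
        _, t = heapq.heappop(queue)
        yield t
        for i in sorted(t):
            t_prev = frozenset(x for x in t if x < i)
            delta_i_t = [c - {i} for c in phi if max(c) == i and not (c & t_prev)]
            if not delta_i_t:                             # Delta^i[t] = 1
                continue
            for t_prime in prime_dnf(delta_i_t):
                u = t_prev | t_prime
                if is_prime_implicant(u, i):
                    t_star = smallest_extension(u, i)
                    if t_star not in seen:
                        seen.add(t_star)
                        heapq.heappush(queue, (weight(t_star), t_star))

output = list(dualize([{1, 2}, {1, 3}, {2, 3, 4}, {1, 4}], 4))
```

On the running example $\varphi = (x_1 \vee x_2)(x_1 \vee x_3)(x_2 \vee x_3 \vee x_4)(x_1 \vee x_4)$, the terms are produced in the order $x_2x_3x_4$, $x_1x_4$, $x_1x_3$, $x_1x_2$, i.e., by increasing weight 7, 9, 10, 12.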

Theorem 3.3 Algorithm DUALIZE correctly outputs all $t \in PI(f)$ in increasing order.

Proof. (Sketch) First note that the term $t^*$ inserted in $Q$ when $t$ is output is larger than $t$. Indeed, $t'$ ($\ne 1$) and $t_{i-1}$ are disjoint and $V(t') \subseteq \{x_1, \ldots, x_{i-1}\}$. Hence, every term in $Q$ is larger than all terms already output, and the output sequence is increasing. We show by induction that, if $t$ is the smallest prime implicant of $f$ that was not output yet, then $t$ is already in $Q$. This clearly proves the result.

Clearly, the above statement is true if $t = t_{min}$. Assume now that $t \ne t_{min}$ is the smallest among the prime implicants not output yet. Let $i$ be the largest index such that $t_i$ is not a prime implicant of $f_i$. This $i$ is well-defined, since otherwise $t = t_{min}$ must hold, a contradiction. Now we have (1) $i < n$ and (2) $x_{i+1} \notin V(t)$, where (1) holds because $t_n$ ($= t$) is a prime implicant of $f_n$ ($= f$) and (2) follows from the maximality of $i$. Let $s \in PI(f_i)$ such that $V(s) \subseteq V(t_i)$, and let $K = V(t_i) - V(s)$. Then $K \ne \emptyset$ holds, and since $x_{i+1} \notin V(t)$, the term $t' = \bigwedge_{x_j \in K} x_j$ is a prime implicant of $\Delta^{i+1}[s]$. There exists $s' \in PI(f)$ such that $s'_i = s$ and $x_{i+1} \in V(s')$, since $s \wedge x_{i+1} \in PI(f_{i+1})$. Note that $\Delta^{i+1}[s] \ne 0$. Moreover, since $s'$ is smaller than $t$, by induction $s'$ has already been output. Therefore, $t' = \bigwedge_{x_j \in K} x_j$ has been considered in the inner for-loop of the algorithm. Since $s'_i \wedge t'$ ($= t_i = t_{i+1}$) is a prime implicant of $f_{i+1}$, the algorithm has added the smallest prime implicant $t^*$ of $f$ such that $t^*_{i+1} = t_{i+1}$. We finally claim that $t^* = t$. Otherwise, let $k$ be the first index in which $t^*$ and $t$ differ. Then $k > i + 1$, $x_k \in V(t)$ and $x_k \notin V(t^*)$. However, this implies $t_k \notin PI(f_k)$, contradicting the maximality of $i$. □



Remark 3.1 (1) The decomposition rule (4) was already used in [33].

(2) In step 1, we could generate any prime implicant $t$ of $f$, and choose then a lexicographic term ordering inherited from a dynamically generated variable ordering. In step 2, it is sufficient that any monotone DNF $\tau_{(t,i)}$ of the function represented by $\Delta^i[t]$ is computed, rather than its prime DNF $\rho_{(t,i)}$. This might make the algorithm faster.

Let us consider the time complexity of algorithm DUALIZE. We store $Q$ as a binary tree, where each leaf represents a term $t$ and the left (resp., right) son of a node at depth $j - 1 \ge 0$, where the root has depth 0, encodes $x_j \in V(t)$ (resp., $x_j \notin V(t)$). In Step 1, we can compute $t_{min}$ in $O(\|\varphi\|)$ time and initialize $Q$ in $O(n)$ time.

As for Step 2, let $T_{(t,i)}$ be the time required to compute the prime DNF $\rho_{(t,i)}$ from $\Delta^i[t]$. By analyzing its substeps, we can see that each iteration of Step 2 requires $\sum_{x_i \in V(t)} (T_{(t,i)} + |\rho_{(t,i)}| \cdot O(\|\varphi\|))$ time.

Indeed, we can update $Q$ (i.e., remove the smallest term and add $t^*$) in $O(n)$ time. For each $t$ and $i$, we can construct $\Delta^i[t]$ in $O(\|\varphi\|)$ time. Moreover, we can check whether $t_{i-1} \wedge t'$ is a prime implicant of $f_i$ and if so, we can compute the smallest prime implicant $t^*$ of $f$ such that $t^*_i = t_{i-1} \wedge t'$ in $O(\|\varphi\|)$ time; note that $t^*$ is the smallest prime implicant of the function obtained from $f$ by fixing $x_j = 1$ if $x_j \in V(t_{i-1} \wedge t')$ and $x_j = 0$ if $x_j \notin V(t_{i-1} \wedge t')$ for $j \le i$.

Hence, we have the following result. 

Theorem 3.4 The output delay of Algorithm DUALIZE is bounded by

$$\max_{t \in PI(f)} \sum_{x_i \in V(t)} \big( T_{(t,i)} + |\rho_{(t,i)}| \cdot O(\|\varphi\|) \big) \qquad (5)$$

time, and DUALIZE needs in total time

$$\sum_{t \in PI(f)} \sum_{x_i \in V(t)} \big( T_{(t,i)} + |\rho_{(t,i)}| \cdot O(\|\varphi\|) \big). \qquad (6)$$

If the $T_{(t,i)}$ are bounded by a polynomial in the input length, then DUALIZE becomes a polynomial delay algorithm, since $|\rho_{(t,i)}| \le T_{(t,i)}$ holds for all $t \in PI(f)$ and $x_i \in V(t)$. On the other hand, if they are bounded by a polynomial in the combined input and output length, then DUALIZE is a polynomial total time algorithm, where $|\rho_{(t,i)}| \le |\psi|$ holds from Lemma 3.2. Using results from [3], we can construct from DUALIZE an incremental polynomial time algorithm for DUALIZATION, which however might not output $PI(f)$ in increasing order. Summarizing, we have the following corollary.

Corollary 3.5 Let $T = \max\{ T_{(t,i)} \mid t \in PI(f), x_i \in V(t) \}$. Then, if $T$ is bounded by a

(i) polynomial in $n$ and $\|\varphi\|$, then DUALIZE is an $O(n \|\varphi\| T)$ polynomial delay algorithm;

(ii) polynomial in $n$, $\|\varphi\|$, and $\|\psi\|$, then DUALIZE is an $O(n \cdot |\psi| \cdot (T + |\psi| \cdot \|\varphi\|))$ polynomial total-time algorithm; moreover, DUALIZATION is solvable in incremental polynomial time.

In the next section, we identify sufficient conditions for the boundedness of T and fruitfully apply 
them to solve open problems and improve previous results. 

4 Polynomial Classes 
4.1 Degenerate CNFs 

We first consider the case of small $\Delta^i[t]$. Generalizing a notion for graphs (i.e., monotone 2-CNFs) [50], we call a monotone CNF $\varphi$ $k$-degenerate, if there exists a variable ordering $x_1, \ldots, x_n$ in which $|\Delta^i| \le k$ for all $i = 1, 2, \ldots, n$. We call a variable ordering $x_1, \ldots, x_n$ smallest last as in [50], if $x_i$ is chosen in the order $i = n, n-1, \ldots, 1$ such that $|\Delta^i|$ is smallest for all variables that were not chosen. Clearly, a smallest last ordering gives the least $k$ such that $\varphi$ is $k$-degenerate. Therefore, we can check for every integer $k \ge 1$ whether $\varphi$ is $k$-degenerate in $O(\|\varphi\|)$ time. If this holds, then we have $|\rho_{(t,i)}| \le n^k$ and $T_{(t,i)} = O(k n^{k+1})$ for every $t \in PI(f)$ and $x_i \in V(t)$ (for $T_{(t,i)}$, apply the distributive law to $\Delta^i[t]$ and remove terms $t$ where some $x_j \in V(t)$ has no $c \in \Delta^i[t]$ such that $V(t) \cap V(c) = \{x_j\}$).



Thus Theorem 3.4 implies the following.

Theorem 4.1 For $k$-degenerate CNFs $\varphi$, DUALIZATION is solvable with $O(\|\varphi\| \cdot n^{k+1})$ polynomial delay if $k \ge 1$ is constant.
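A smallest last ordering, and with it the least $k$ for which a CNF is $k$-degenerate, can be computed greedily: repeatedly pick the variable that occurs in the fewest remaining clauses, make it the last unplaced variable, and delete the clauses containing it. The sketch below is a straightforward quadratic version rather than the linear-time procedure referred to in the text; the function name and tie-breaking rule are assumptions.

```python
def smallest_last_ordering(clauses, variables):
    """Return (ordering x_1..x_n, k) where k = max |Delta^i| under that ordering."""
    remaining = [frozenset(c) for c in clauses]
    unchosen = set(variables)
    chosen, k = [], 0
    while unchosen:
        # The next variable (x_n first, then x_{n-1}, ...) is one that occurs in
        # the fewest remaining clauses; those clauses form its Delta^i.
        x = min(sorted(unchosen), key=lambda v: sum(v in c for c in remaining))
        k = max(k, sum(x in c for c in remaining))
        chosen.append(x)
        unchosen.remove(x)
        remaining = [c for c in remaining if x not in c]
    chosen.reverse()  # the first variable picked is x_n
    return chosen, k

# (x1 ∨ x2)(x1 ∨ x3)(x2 ∨ x3 ∨ x4)(x1 ∨ x4): every variable occurs at least twice.
_, k_example = smallest_last_ordering([{1, 2}, {1, 3}, {2, 3, 4}, {1, 4}], {1, 2, 3, 4})
# (x1 ∨ x2 ∨ x3)(x1 ∨ x3 ∨ x5)(x1 ∨ x5 ∨ x6)(x3 ∨ x4 ∨ x5)
_, k_acyclic = smallest_last_ordering(
    [{1, 2, 3}, {1, 3, 5}, {1, 5, 6}, {3, 4, 5}], {1, 2, 3, 4, 5, 6})
```

Here `k_example` is 2, since whichever variable comes last appears in two clauses, while the second CNF turns out to be 1-degenerate.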



Applying the result of [37] that a log-clause CNF is dualizable in incremental polynomial time, we obtain a polynomiality result also for non-constant degeneracy:

Theorem 4.2 For $O(\log \|\varphi\|)$-degenerate CNFs $\varphi$, problem DUALIZATION is solvable in polynomial total time.

In the following, we discuss several natural subclasses of degenerate CNFs.

4.1.1 Read-bounded CNFs

A monotone CNF $\varphi$ is called read-$k$, if each variable appears in $\varphi$ at most $k$ times. Clearly, read-$k$ CNFs are $k$-degenerate, and in fact $\varphi$ is read-$k$ iff it is $k$-degenerate under every variable ordering. By applying Theorems 4.1 and 4.2, we obtain the following result.

Corollary 4.3 For read-$k$ CNFs $\varphi$, problem DUALIZATION is solvable

(i) with $O(\|\varphi\| \cdot n^{k+1})$ polynomial delay, if $k$ is constant;

(ii) in polynomial total time, if $k = O(\log \|\varphi\|)$.

Note that Corollary 4.3 (i) trivially implies that DUALIZATION is solvable in $O(|\psi| \cdot n^{k+2})$ time for constant $k$, since $\|\varphi\| \le kn$. This improves upon the previous best known algorithm [12], which is only $O(|\psi| \cdot n^{k+3})$ time, not polynomial delay, and outputs $PI(f)$ in no specific order. Corollary 4.3 (ii) is a non-trivial generalization of the result in [12], which was posed as an open problem [11].

4.1.2 Acyclic CNFs 

As in graphs, acyclicity is appealing in hypergraphs resp. monotone CNFs from a theoretical as well as a practical point of view. However, there are many notions of acyclicity for hypergraphs (cf. [16]), since different generalizations from graphs are possible. We refer to $\alpha$-, $\beta$-, $\gamma$-, and Berge-acyclicity as stated in [16], for which the following proper inclusion hierarchy is known:

Berge-acyclic $\subset$ $\gamma$-acyclic $\subset$ $\beta$-acyclic $\subset$ $\alpha$-acyclic.

The notion of $\alpha$-acyclicity came up in relational database theory. A monotone CNF $\varphi$ is $\alpha$-acyclic iff $\varphi = 1$ or reducible by the GYO-reduction [25], i.e., repeated application of one of the two rules:

(1) If variable $x_i$ occurs in only one clause $c$, remove $x_i$ from $c$.

(2) If distinct clauses $c$ and $c'$ satisfy $V(c) \subseteq V(c')$, remove $c$ from $\varphi$.

to $0$ (i.e., the empty clause). Note that $\alpha$-acyclicity of a monotone CNF $\varphi$ can be checked, and a suitable GYO-reduction output, in $O(\|\varphi\|)$ time [48]. A monotone CNF $\varphi$ is $\beta$-acyclic iff every CNF consisting of clauses in $\varphi$ is $\alpha$-acyclic. As shown in [15], the prime implicants of a monotone $f$ represented by a $\beta$-acyclic CNF $\varphi$ can be enumerated (and thus DUALIZATION solved) in $p(\|\varphi\|) \cdot |\psi|$ time, where $p$ is a polynomial in $\|\varphi\|$. However, the time complexity of DUALIZATION for the more general $\alpha$-acyclic prime CNFs was left as an open problem. We now show that it is solvable with polynomial delay, by showing that $\alpha$-acyclic CNFs are 1-degenerate.
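The two GYO rules translate directly into a simple test for $\alpha$-acyclicity. The following sketch applies the rules until neither is applicable and then checks whether only the empty clause is left; it is a naive polynomial-time version, not the linear-time method of [48], and the function name is an assumption.

```python
def is_alpha_acyclic(clauses):
    """GYO-reduction: True iff the clause sets reduce to the single empty clause."""
    cs = [set(c) for c in clauses]
    changed = True
    while changed and cs:
        changed = False
        # Rule (1): remove every variable that occurs in only one clause.
        for c in cs:
            lonely = {x for x in c if sum(x in d for d in cs) == 1}
            if lonely:
                c -= lonely
                changed = True
        # Rule (2): remove one clause that is contained in another clause.
        for i, c in enumerate(cs):
            if any(j != i and c <= d for j, d in enumerate(cs)):
                del cs[i]
                changed = True
                break
    return not cs or cs == [set()]

# (x1 ∨ x2 ∨ x3)(x1 ∨ x3 ∨ x5)(x1 ∨ x5 ∨ x6)(x3 ∨ x4 ∨ x5)
acyclic = is_alpha_acyclic([{1, 2, 3}, {1, 3, 5}, {1, 5, 6}, {3, 4, 5}])
# (x1 ∨ x2 ∨ x3)(x1 ∨ x2 ∨ x4)(x2 ∨ x3 ∨ x4 ∨ x5)
cyclic = is_alpha_acyclic([{1, 2, 3}, {1, 2, 4}, {2, 3, 4, 5}])
```

The first CNF reduces completely, whereas the second gets stuck after $x_5$ is removed, because the three remaining clauses are pairwise incomparable and every remaining variable occurs at least twice.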

Let $\varphi \ne 1$ be a prime CNF. Let $a = a_1, a_2, \ldots, a_q$ be a GYO-reduction for $\varphi$, where $a_\ell = x_i$ if the $\ell$-th operation removes $x_i$ from $c$, and $a_\ell = c$ if it removes $c$ from $\varphi$. Consider the unique variable ordering $b_1, b_2, \ldots, b_n$ such that $b_i$ occurs after $b_j$ in $a$, for all $i < j$. For example, let $\varphi = c_1 c_2 c_3 c_4$, where $c_1 = (x_1 \vee x_2 \vee x_3)$, $c_2 = (x_1 \vee x_3 \vee x_5)$, $c_3 = (x_1 \vee x_5 \vee x_6)$ and $c_4 = (x_3 \vee x_4 \vee x_5)$. Then $\varphi$ is $\alpha$-acyclic, since it has the GYO-reduction

$a_1 = x_2$, $a_2 = c_1$, $a_3 = x_4$, $a_4 = x_6$, $a_5 = c_4$, $a_6 = c_3$, $a_7 = x_1$, $a_8 = x_3$, $a_9 = x_5$.

From this sequence, we obtain the variable ordering

$b_1 = x_5$, $b_2 = x_3$, $b_3 = x_1$, $b_4 = x_6$, $b_5 = x_4$, $b_6 = x_2$.

As easily checked, this ordering shows that $\varphi$ is 1-degenerate. Under this ordering, we have $\Delta^1 = \Delta^2 = 1$, $\Delta^3 = (x_1 \vee x_3 \vee x_5)$, $\Delta^4 = (x_1 \vee x_5 \vee x_6)$, $\Delta^5 = (x_3 \vee x_4 \vee x_5)$, and $\Delta^6 = (x_1 \vee x_2 \vee x_3)$. This is not accidental.

Lemma 4.4 Every $\alpha$-acyclic prime CNF is 1-degenerate.

Note that the converse is not true, i.e., there exists a 1-degenerate CNF that is not $\alpha$-acyclic. For example, $\varphi = (x_1 \vee x_2 \vee x_3)(x_1 \vee x_2 \vee x_4)(x_2 \vee x_3 \vee x_4 \vee x_5)$ is such a CNF. Lemma 4.4 and Theorem 4.1 imply the following result.

Corollary 4.5 For $\alpha$-acyclic CNFs $\varphi$, problem DUALIZATION is solvable with $O(\|\varphi\| \cdot n^2)$ delay.

Observe that for a prime $\alpha$-acyclic $\varphi$, we have $|\varphi| \le n$. Thus, if we slightly modify algorithm DUALIZE to check $\Delta^i = 1$ in advance (which can be done in linear time in a preprocessing phase) such that such $\Delta^i$ need not be considered in step 2, then the resulting algorithm has $O(n \cdot |\varphi| \cdot \|\varphi\|)$ delay. Observe that the algorithm in [15] solves, minorly adapted for enumerative output, DUALIZATION for $\beta$-acyclic CNFs with $O(n \cdot |\varphi| \cdot \|\varphi\|)$ delay. Thus, the above modification of DUALIZE is of the same order.

4.1.3 CNFs with bounded treewidth 

A tree decomposition (of type I) of a monotone CNF $\varphi$ is a tree $T = (W, E)$ where each node $w \in W$ is labeled with a set $X(w) \subseteq V(\varphi)$ under the following conditions:

1. $\bigcup_{w \in W} X(w) = V(\varphi)$;

2. for every clause $c$ in $\varphi$, there exists some $w \in W$ such that $V(c) \subseteq X(w)$; and

3. for any variable $x_i \in V$, the set of nodes $\{w \in W \mid x_i \in X(w)\}$ induces a (connected) subtree of $T$.

The width of $T$ is $\max_{w \in W} |X(w)| - 1$, and the treewidth of $\varphi$, denoted by $Tw_1(\varphi)$, is the minimum width over all its tree decompositions.



Note that the usual definition of treewidth for a graph [45] results in the case where $\varphi$ is a 2-CNF. Similarly to acyclicity, there are several notions of treewidth for hypergraphs resp. monotone CNFs. For example, tree decomposition of type II of CNF $\varphi = \bigwedge_{c \in C} c$ is defined as type-I tree decomposition of its incidence 2-CNF (i.e., graph) $G(\varphi)$ [13]. That is, for each clause $c \in \varphi$, we introduce a new variable $y_c$ and construct $G(\varphi) = \bigwedge_{x_i \in c \in \varphi} (x_i \vee y_c)$ (here, $x_i \in c$ denotes that $x_i$ appears in $c$). Let $Tw_2(\varphi)$ denote the type-II treewidth of $\varphi$.
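The three conditions of a type-I tree decomposition, and the passage to the incidence 2-CNF $G(\varphi)$, can be checked mechanically on small instances. The sketch below assumes that the node set and edge list passed in already form a tree, and uses string names such as `"x1"` and `"y0"` for variables; these names and both function names are assumptions of the illustration.

```python
def incidence_2cnf(clauses):
    """G(phi): one 2-clause (x_i ∨ y_c) for every occurrence of x_i in a clause c."""
    return [{x, "y%d" % idx} for idx, c in enumerate(clauses) for x in sorted(c)]

def is_tree_decomposition(bags, tree_edges, clauses):
    """Check conditions 1-3 for bags: dict node -> set of variables (tree shape assumed)."""
    variables = set().union(*clauses) if clauses else set()
    if set().union(*bags.values()) != variables:                              # condition 1
        return False
    if not all(any(set(c) <= b for b in bags.values()) for c in clauses):     # condition 2
        return False
    for x in variables:                                                       # condition 3
        nodes = {w for w, b in bags.items() if x in b}
        start = next(iter(nodes))
        reached, frontier = {start}, [start]
        while frontier:  # search restricted to the nodes whose bag contains x
            w = frontier.pop()
            for a, b in tree_edges:
                for u, v in ((a, b), (b, a)):
                    if u == w and v in nodes and v not in reached:
                        reached.add(v)
                        frontier.append(v)
        if reached != nodes:
            return False
    return True

# phi = (x1 ∨ x2 ∨ x3): a path of bags {x_w, y_c} is a type-II decomposition of width 1.
g = incidence_2cnf([{"x1", "x2", "x3"}])
bags = {1: {"x1", "y0"}, 2: {"x2", "y0"}, 3: {"x3", "y0"}}
ok = is_tree_decomposition(bags, [(1, 2), (2, 3)], g)
width = max(len(b) for b in bags.values()) - 1
```

Dropping the edge between nodes 2 and 3 breaks condition 3 for $y_c$, and the check then fails.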

Proposition 4.6 For every monotone CNF $\varphi$, it holds that $Tw_2(\varphi) \le Tw_1(\varphi) + 2^{Tw_1(\varphi)+1}$.

Proof. Let $T = (W, E)$, $X : W \to 2^V$ be any tree decomposition of $\varphi$ having width $Tw_1(\varphi)$. Introduce for all $c \in \varphi$ new variables $y_c$, and add $y_c$ to every $X(w)$ such that $V(c) \subseteq X(w)$. Clearly, the result is a type-I tree decomposition of $G(\varphi)$, and thus a type-II tree decomposition of $\varphi$. Since at most $2^{|X(w)|}$ many $y_c$ are added to $X(w)$ and $|X(w)| - 1 \le Tw_1(\varphi)$ for every $w \in W$, the result follows. □

This means that if $Tw_1(\varphi)$ is bounded by some constant, then so is $Tw_2(\varphi)$. Moreover, $Tw_1(\varphi) = k$ implies that $\varphi$ is a $k$-CNF; we discuss $k$-CNFs in Section 4.2 and only consider $Tw_2(\varphi)$ here. The following proposition states some relationships between type-II treewidth and other restrictions of CNFs from above.

Proposition 4.7 The following properties hold for type-II treewidth.

(i) There is a family of monotone prime CNFs $\varphi$ such that $Tw_2(\varphi)$ is bounded by a constant, but $\varphi$ is not $k$-CNF for any constant $k$.

(ii) There is a family of monotone prime CNFs $\varphi$ such that $Tw_2(\varphi)$ is bounded by a constant, but $\varphi$ does not have bounded read.

(iii) There is a family of $\alpha$-acyclic prime CNFs $\varphi$ such that $Tw_2(\varphi)$ is not bounded by any constant. (This is a contrast to the graph case that a graph is acyclic if and only if its treewidth is 1.)

Proof. (i): For example, $\varphi = (\bigvee_{x_i \in V} x_i)$ has $Tw_2(\varphi) = 1$, since it has a tree decomposition $T = (W, E)$ with $X : W \to 2^V$ defined by $W = \{1, 2, \ldots, n\}$, $E = \{(w, w+1), w = 1, 2, \ldots, n-1\}$, and $X(w) = \{x_w, y_c\}$, $w \in W$, where $c = (\bigvee_{x_i \in V} x_i)$. However, it is not an $(n-1)$-CNF (but an $n$-CNF).

On the other hand, by Lemma 4.8, we can see that there is a family of monotone prime CNFs $\varphi$ such that $Tw_2(\varphi)$ is not bounded by any constant, but $\varphi$ is $k$-CNF for some constant $k$.

(ii): For example, let $\varphi$ be a CNF containing $n - 1$ clauses $c_i = (x_1 \vee x_i)$, $i = 2, 3, \ldots, n$. Then $\varphi$ has $Tw_2(\varphi) = 1$, since it has a tree decomposition $T = (W, E)$ with $X : W \to 2^V$ defined by $W = \{(c_i, x_1), (c_i, x_i), i = 2, 3, \ldots, n\}$, $E = \{((c_i, x_1), (c_{i+1}, x_1)), i = 2, 3, \ldots, n-1\} \cup \{((c_i, x_1), (c_i, x_i)), i = 2, 3, \ldots, n\}$, and $X((c_i, x_k)) = \{y_{c_i}, x_k\}$, $(c_i, x_k) \in W$. However, it is not read-$(n-2)$ (but read-$(n-1)$).

(iii): For example, let $\varphi$ be a CNF on $V = \{x_1, x_2, \ldots, x_{2n}\}$ containing $n$ clauses $c_i = (x_i \vee \bigvee_{j \ge n+1} x_j)$, for $i = 1, \ldots, n$. Then $\varphi$ is $\alpha$-acyclic. We claim that $Tw_2(\varphi) \ge n-1$. Let us assume that
there exists a tree $T = (W, E)$ with $X : W \to 2^V$ that shows $Tw_2(\varphi) \le n-2$, where $T$ is regarded as
a rooted tree. Let $T_i = (W_i, E_i)$ be the subtree of $T$ induced by $W_i = \{w \in W \mid y_{c_i} \in X(w)\}$, and
let $r_i$ be its root. Consider the case in which $W_i$ and $W_j$ are disjoint for some $i$ and $j$. Suppose that $r_j$
is an ancestor of $r_i$. Since $|X(r_i)| \le Tw_2(\varphi) + 1 \le n-1$, there exists a node $x_{n+k} \in V$ such that
$1 \le k \le n$ and $x_{n+k} \notin X(r_i)$. However, since the incidence graph of $\varphi$ contains two edges $(x_{n+k}, y_{c_i})$
and $(x_{n+k}, y_{c_j})$, we have $x_{n+k} \in \bigcup_{w \in W_i - \{r_i\}} X(w)$ and $x_{n+k} \in \bigcup_{w \in W_j} X(w)$. This is a contradiction
to the condition that $\{w \in W \mid x_{n+k} \in X(w)\}$ is connected. Similarly, we can prove our claim when
$T_i$ and $T_j$ are disjoint, but $r_j$ is not an ancestor of $r_i$.

We thus consider the case in which $W_i \cap W_j \neq \emptyset$ holds for any $i$ and $j$. Since the $T_i$'s are trees, the
family of $W_i$, $i = 1, 2, \ldots, n$, satisfies the well-known Helly property, i.e., there exists a node $w$ in
$\bigcap_{i=1}^{n} W_i$. Then $X(w)$ must contain all $y_{c_i}$'s. This implies $|X(w)| \ge n$, a contradiction. □
As we show now, bounded treewidth implies bounded degeneracy.

Lemma 4.8 Let $\varphi$ be any monotone CNF with $Tw_2(\varphi) = k$. Then $\varphi$ is $2^k$-degenerate.

Proof. Let $T = (W, E)$ with $X : W \to 2^V$ show $Tw_2(\varphi) = k$. From this, we reversely construct a
variable ordering $a = a_1, \ldots, a_n$ on $V = V(\varphi)$ such that $|\Delta^i| \le 2^k$ for all $i$.

Set $i := n$. Choose any leaf $w^*$ of $T$, and let $p(w^*)$ be a node in $W$ adjacent to $w^*$. If $X(w^*) \setminus X(p(w^*)) \subseteq \{y_c \mid c \in \varphi\}$, then remove $w^*$ from $T$. On the other hand, if $(X(w^*) \setminus X(p(w^*))) \cap V = \{x_{j_1}, \ldots, x_{j_\ell}\}$ where $\ell \ge 1$ (in this case, only $X(w^*)$ contains $x_{j_1}, \ldots, x_{j_\ell}$), then define $a_{i+1-h} = x_{j_h}$
for $h = 1, \ldots, \ell$ and update $i := i - \ell$, $X(w^*) := X(w^*) \setminus \{x_{j_1}, \ldots, x_{j_\ell}\}$, and $X(w) := X(w) \setminus \{y_c \mid c \in \varphi, V(c) \cap \{x_{j_1}, \ldots, x_{j_\ell}\} \neq \emptyset\}$ for every $w \in W$. Let $a$ be completed by repeating this process.

We claim that $a$ shows that $|\Delta^i| \le 2^k$ for all $i = 1, \ldots, n$. To see this, let $w^*$ be chosen during this
process, and assume that $a_i \in X(w^*) \setminus X(p(w^*))$. Then, by induction on the (reverse) construction of
$a$, we obtain that for each clause $c \in \Delta^i$ we must have either (a) $y_c \in X(w^*)$ or (b) $V(c) \subseteq X(w^*)$.
The latter case may arise if in previous steps of the process some descendant $d(w^*)$ of $w^*$ was removed
which contains $y_c$ such that $y_c$ does not occur in $w^*$; however, in this case $V(c) \subseteq X(w)$ must be true
on every node $w$ on the path from $d(w^*)$ to $w^*$.

Now let $q = |X(w^*) \setminus V|$. Since $|X(w^*) \setminus \{a_i\}| \le k$, we have

$$|\Delta^i| \le q + 2^{k-q} \le 2^k.$$

This proves the claim. □ 



Corollary 4.9 For CNFs $\varphi$ with $Tw_2(\varphi) \le k$, DUALIZATION is solvable (i) with $O(\|\varphi\| \cdot n^{2^k+1})$
polynomial delay, if $k$ is constant; and (ii) in polynomial total time, if $k = O(\log\log \|\varphi\|)$.



4.2 Recursive application of algorithm Dualize 

Algorithm DUALIZE computes in step 2 the prime DNF $\rho_{(t,i)}$ of the function represented by $\Delta^i[t]$.
Since $\Delta^i[t]$ is the prime CNF of some monotone function, we can recursively apply Dualize to $\Delta^i[t]$
for computing $\rho_{(t,i)}$. Let us call this variant R-DUALIZE. Then we have the following result.

Theorem 4.10 If its recursion depth is $d$, R-Dualize solves DUALIZATION in $O(n^{d-1} \cdot |\psi|^{d-1} \cdot \|\varphi\|)$
time.

Proof. If $d = 1$, then $\Delta^i[t_{min}] = 1$ holds for $t_{min}$ and every $i \ge 1$. This means that $PI(f) = \{t_{min}\}$
and $\varphi$ is a 1-CNF (i.e., each clause in $\varphi$ contains exactly one variable). Thus in this case, R-Dualize
needs 0(n) time. Recall that algorithm Dualize needs, by @, time J2tePi(f) J2x x ev(t)(T(t,i)+\P{t,i) \ ' 
0(||^|D). If d = 2, then Tun = 0(n) and \pua\ < 1. Therefore, R-DUALIZE needs time 0(n ■ ■ 
\\<p\\). For d > 3, Corollary 0-( u ) im pl ies tnat R-Dualize needs 0(n d ' 1 ■ • ||^||) time. □ 



Recall that a CNF $\varphi$ is called $k$-CNF if each clause in $\varphi$ has at most $k$ literals. Clearly, if we apply
algorithm R-Dualize to a monotone $k$-CNF $\varphi$, the recursion depth of R-Dualize is at most $k$. Thus
we obtain the following result; it re-establishes, with different means, the main positive result of M, 15]. 

Corollary 4.11 R-DUALIZE solves DUALIZATION in $O(n^{k-1} \cdot |\psi|^{k-1} \cdot \|\varphi\|)$ time, i.e., in polynomial
total time for monotone $k$-CNFs $\varphi$ where $k$ is constant.

5 Limited Nondeterminism 

In the previous section, we have discussed polynomial cases of monotone dualization. In this section, 
we now turn to the issue of the precise complexity of this problem. For this purpose, we consider 
the decision problem DUAL, i.e., decide whether given monotone prime CNFs $\varphi$ and $\psi$ represent dual
Boolean functions, instead of the search problem DUALIZATION. 

It appears that problem Dual can be solved with limited nondeterminism, i.e., with poly-log many 
guessed bits by a polynomial-time non-deterministic Turing machine. This result might bring new 
insight towards settling the complexity of the problem. 



We adopt Kintala and Fischer's terminology [32] and write $g(n)$-P for the class of sets accepted by
a nondeterministic Turing machine in polynomial time making at most $g(n)$ nondeterministic steps on
every input of length $n$. For every integer $k \ge 1$, define $\beta_k$P $= \bigcup_c (c \log^k n)$-P. The $\beta$P Hierarchy
consists of the classes

$$\mathrm{P} = \beta_1\mathrm{P} \subseteq \beta_2\mathrm{P} \subseteq \cdots \subseteq \bigcup_k \beta_k\mathrm{P} = \beta\mathrm{P}$$

and lies between P and NP. The $\beta_k$P classes appear to be rather robust; they are closed under polynomial
time and logspace many-one reductions and have complete problems (cf. [p3|]). The complement class 
of $\beta_k$P is denoted by co-$\beta_k$P.



We start in Section 5.1 by recalling algorithm A of [17], reformulated for CNFs, and by analyzing
A's behavior. The proof that A can be converted to an algorithm that uses $\log^3 n$ nondeterministic bit
guesses, and that Dual is thus in co-$\beta_3$P, is rather easy and should give the reader an intuition of how
our new method of analysis works. In Section 5.2, we use basically the same technique for analyzing
the more involved algorithm B of [17]. Using a modification of this algorithm, we show that DUAL is
in co-$\beta_2$P. We also prove the stronger result that the complement of Dual can be solved in polynomial
time with only $O(\chi(n) \cdot \log n)$ nondeterministic steps (= bit guesses). Finally, Section 5.3 shows that
membership in co-$\beta_2$P can alternatively be obtained by combining the results of [17] with a theorem of
Beigel and Fu [|[]. 

5.1 Analysis of Algorithm A of Fredman and Khachiyan 

The first algorithm in [17] for recognizing dual monotone pairs is as follows.



Algorithm A (reformulated for CNFs$^1$).

Input: Monotone CNFs $\varphi$, $\psi$ representing monotone $f$, $g$ s.t. $V(c) \cap V(d) \neq \emptyset$, for all $c \in \varphi$, $d \in \psi$.
Output: yes if $f = g^d$, otherwise a vector $w$ of form $w = (w_1, \ldots, w_m)$ such that $f(w) \neq g^d(w)$.



$^1$ In [17], duality is tested for DNFs while our problem DUAL speaks about CNFs; this is insignificant, since DNFs are
trivially translated to CNFs for this task and vice versa (cf. Section g). 
Step 1:

Delete all redundant (i.e., non-minimal) clauses from $\varphi$ and $\psi$.

Step 2:

Check that (1) $V(\varphi) = V(\psi)$, (2) $\max_{c \in \varphi} |c| \le |\psi|$, (3) $\max_{c' \in \psi} |c'| \le |\varphi|$, and
(4) $\sum_{c \in \varphi} 2^{-|c|} + \sum_{c' \in \psi} 2^{-|c'|} \ge 1$.
If any of conditions (1)-(4) fails, $f \neq g^d$ and a witness $w$ is found in polynomial time (cf. [17]).

Step 3:

If $|\varphi| \cdot |\psi| \le 1$, test duality in $O(1)$ time.

Step 4:

If $|\varphi| \cdot |\psi| \ge 2$, find some $x_i$ occurring in $\varphi$ or $\psi$ (w.l.o.g. in $\varphi$) with frequency $\ge 1/\log(|\varphi| + |\psi|)$.
Let

$\varphi_0 = \{c - \{x_i\} \mid x_i \in c, c \in \varphi\}$, $\varphi_1 = \{c \mid x_i \notin c, c \in \varphi\}$,

$\psi_0 = \{d - \{x_i\} \mid x_i \in d, d \in \psi\}$, $\psi_1 = \{d \mid x_i \notin d, d \in \psi\}$.

Call algorithm A on the two pairs of forms:

(A.1) $(\varphi_1, \psi_0 \wedge \psi_1)$ and (A.2) $(\psi_1, \varphi_0 \wedge \varphi_1)$.

If both calls return yes, then return yes (as $f = g^d$), otherwise we obtain $w$ such that
$f(w) \neq g^d(w)$ in polynomial time (cf. [17]).
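The self-reduction in step 4 can be illustrated on tiny instances. The sketch below is only an illustration under stated assumptions, not the algorithm of [17]: CNFs are modeled as sets of `frozenset` clauses, duality is tested by brute force over all assignments (exponential, for demonstration only), and the helper names `evaluate`, `is_dual_pair`, `split`, and `dual_via_split` are hypothetical.

```python
from itertools import product


def evaluate(cnf, true_vars):
    """A monotone CNF is true iff every clause contains a variable set to true."""
    return all(clause & true_vars for clause in cnf)


def is_dual_pair(phi, psi, variables):
    """Brute-force test of f = g^d, i.e. f(w) = not g(complement of w) for all w."""
    variables = sorted(variables)
    for bits in product([False, True], repeat=len(variables)):
        w = frozenset(v for v, b in zip(variables, bits) if b)
        w_bar = frozenset(variables) - w
        if evaluate(phi, w) == evaluate(psi, w_bar):
            return False  # f(w) differs from g^d(w)
    return True


def split(cnf, x):
    """Return (cnf_0, cnf_1): clauses containing x with x removed, and clauses without x."""
    cnf0 = {c - {x} for c in cnf if x in c}
    cnf1 = {c for c in cnf if x not in c}
    return cnf0, cnf1


def dual_via_split(phi, psi, x, variables):
    """Decide duality through the two subproblems (A.1) and (A.2) for variable x."""
    rest = set(variables) - {x}
    phi0, phi1 = split(phi, x)
    psi0, psi1 = split(psi, x)
    a1 = is_dual_pair(phi1, psi0 | psi1, rest)  # (A.1): (phi_1, psi_0 AND psi_1)
    a2 = is_dual_pair(psi1, phi0 | phi1, rest)  # (A.2): (psi_1, phi_0 AND phi_1)
    return a1 and a2
```

For example, with $\varphi = (x_1 \vee x_2) \wedge (x_1 \vee x_3)$ and $\psi = x_1 \wedge (x_2 \vee x_3)$, both the direct test and the split on $x_1$ report a dual pair.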

We observe that, as noted in [17], the binary length of any standard encoding of the input $\varphi$, $\psi$ to
algorithm A is polynomially related to $|\varphi| + |\psi|$, if step 3 is reached. Thus, for our purpose, we consider
$|\varphi| + |\psi|$ to be the input size.

Let $\varphi^*$, $\psi^*$ be the original input for A. For any pair $(\varphi, \psi)$ of CNFs, define its volume by $v = |\varphi| \cdot |\psi|$,
and let $\varepsilon = 1/\log n$, where $n = |\varphi^*| + |\psi^*|$. As shown in [17], step 4 of algorithm A divides the current
(sub)problem of volume $v = |\varphi| \cdot |\psi|$ by self-reduction into subproblems (A.1) and (A.2) of respective
volumes (assuming that $x_i$ frequently occurs in $\varphi$):

$$|\varphi_1| \cdot |\psi_0 \wedge \psi_1| \le (1 - \varepsilon) \cdot v \qquad (7)$$

$$|\varphi_0 \wedge \varphi_1| \cdot |\psi_1| \le |\varphi| \cdot (|\psi| - 1) \le v - 1 \qquad (8)$$

Let $T = T(\varphi, \psi)$ be the recursion tree generated by A on input $(\varphi, \psi)$. In $T$, each node $u$ is labeled
with the respective monotone pair, denoted by $I(u)$; thus, if $r$ is the root of $T$, then $I(r) = (\varphi, \psi)$. The
volume $v(u)$ of node $u$ is defined as the volume of its label $I(u)$.

Any node $u$ is a leaf of $T$, if algorithm A stops on input $I(u) = (\varphi, \psi)$ during steps 1-3; otherwise,
$u$ has a left child $u_l$ and a right child $u_r$ corresponding to (A.1) and (A.2), i.e., labeled $(\varphi_1, \psi_0 \wedge \psi_1)$
and $(\psi_1, \varphi_0 \wedge \varphi_1)$ respectively. That is, $u_l$ is the "high frequency move" by the splitting variable.

We observe that every node u in T is determined by a unique path from the root to u in T and thus 
by a unique sequence seq(u) of right and left moves starting from the root of T and ending at u. The 
following key lemma bounds the number of moves of each type for certain inputs. 

Lemma 5.1 Suppose $|\varphi^*| + |\psi^*| \le |\varphi^*| \cdot |\psi^*|$. Then for any node $u$ in $T$, $seq(u)$ contains at most $v^*$
right moves and at most $\log^2 v^*$ left moves, where $v^* = |\varphi^*| \cdot |\psi^*|$.






Proof. By (7) and (8), each move decreases the volume of a node label. Thus, the length of $seq(u)$, and
in particular the number of right moves, is bounded by $v^*$. To obtain the better bound for the left moves,
we will use the following well-known inequality:

$$(1 - 1/y)^y \le 1/e, \quad \text{for } y \ge 1. \qquad (9)$$

In fact, the sequence $(1 - 1/y_i)^{y_i}$, for any $1 \le y_1 < y_2 < \cdots$, monotonically converges to $1/e$ from
below. By (7), the volume $v(u)$ of any node $u$ such that $seq(u)$ contains $\log^2 v^*$ left moves is bounded
as follows:

$$v(u) \le v^* \cdot (1 - \varepsilon)^{\log^2 v^*} = v^* \cdot (1 - 1/\log n)^{\log^2 v^*}.$$

Since $n = |\varphi^*| + |\psi^*| \le |\varphi^*| \cdot |\psi^*| = v^*$, and because of (9) it follows that:

$$v(u) \le v^* \cdot \bigl((1 - 1/\log v^*)^{\log v^*}\bigr)^{\log v^*} \le v^* \cdot (1/e)^{\log v^*} = v^*/e^{\log v^*} < v^*/2^{\log v^*} = 1.$$

Thus, $u$ must be a leaf in $T$. Hence for every $u$ in $T$, $seq(u)$ contains at most $\log^2 v^*$ left moves. □
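The final chain of inequalities can also be checked numerically. The snippet below is a small sanity check, not part of the proof; it assumes logarithms to base 2 (consistent with $2^{\log v^*} = v^*$ above) and evaluates the bound in the extreme case $n = v^*$, where $\varepsilon = 1/\log n$ is smallest. The function name `volume_bound_after_left_moves` is hypothetical.

```python
import math


def volume_bound_after_left_moves(v_star):
    """Evaluate v* * (1 - 1/log v*)^(log^2 v*) with base-2 logarithms (case n = v*)."""
    log_v = math.log2(v_star)
    return v_star * (1.0 - 1.0 / log_v) ** (log_v * log_v)


for v_star in (4, 16, 1024, 10**6):
    # Each printed value should be below 1, matching the claim that u is a leaf.
    print(v_star, volume_bound_after_left_moves(v_star))
```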



Theorem 5.2 Problem Dual is in co-$\beta_3$P.

Proof. Instances such that either $c \cap c' = \emptyset$ for some $c \in \varphi^*$ and $c' \in \psi^*$, the sequence $seq(u)$ is
empty, or $|\varphi^*| + |\psi^*| > |\varphi^*| \cdot |\psi^*|$ are easily recognized and solved in deterministic polynomial time.
In the remaining cases, if $f \neq g^d$, then there exists a leaf $u$ in $T$ labeled by a non-dual pair $(\varphi', \psi')$. If
$seq(u)$ is known, we can compute, by simulating A on the branch described by $seq(u)$, the entire path
$u_0, u_1, \ldots, u_l = u$ from the root $u_0$ to $u$ with all labels $I(u_0) = (\varphi^*, \psi^*), I(u_1), \ldots, I(u_l)$ and check
that $I(u_l)$ is non-dual in steps 2 and 3 of A in polynomial time. Since the binary length of any standard
encoding of $(\varphi^*, \psi^*)$ is polynomially related to $n = |\varphi^*| + |\psi^*|$ if $seq(u)$ is nonempty, to prove the
result it is sufficient to show that $seq(u)$ can be constructed in polynomial time from $O(\log^3 v^*)$ suitably
guessed bits. To see this, let us represent every $seq(u)$ as a sequence $seq^*(u) = [\ell_0, \ell_1, \ell_2, \ldots, \ell_k]$, where
$\ell_0$ is the number of leading right moves and $\ell_i$ is the number of consecutive right moves after the $i$-th left
move in $seq(u)$, for $i = 1, \ldots, k$. For example, if $seq(u) = [r, r, l, r, r, r, l]$, then $seq^*(u) = [2, 3, 0]$.
By Lemma 5.1, $seq^*(u)$ has length at most $\log^2 v^* + 1$. Thus, $seq^*(u)$ occupies in binary only $O(\log^3 v^*)$
bits; moreover, $seq(u)$ is trivially computed from $seq^*(u)$ in polynomial time. □
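The compact representation $seq^*(u)$ used in this proof is a run-length encoding of the right moves between left moves. A minimal sketch follows; the function names `compress` and `expand` and the use of the strings `"r"` and `"l"` for moves are illustrative assumptions.

```python
def compress(seq):
    """Map a move sequence to seq*: counts of right moves before and after each left move."""
    counts = [0]
    for move in seq:
        if move == "r":
            counts[-1] += 1      # extend the current run of right moves
        elif move == "l":
            counts.append(0)     # a left move starts a new (possibly empty) run
        else:
            raise ValueError(f"unknown move: {move!r}")
    return counts


def expand(counts):
    """Inverse of compress: rebuild the move sequence from the run lengths."""
    seq = []
    for index, run in enumerate(counts):
        if index > 0:
            seq.append("l")
        seq.extend("r" * run)
    return seq
```

Since each entry is at most $v^*$, it fits into $O(\log v^*)$ bits, and there are at most $\log^2 v^* + 1$ entries, which gives the $O(\log^3 v^*)$ bound.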



5.2 Analysis of Algorithm B of Fredman and Khachiyan 

The aim of the above proof was to exhibit a new method of algorithm analysis that allows us to show
with very simple means that duality can be polynomially checked with limited nondeterminism. By
applying the same method of analysis to the slightly more involved algorithm B of [17] (which runs in
$n^{4\chi(n)+O(1)}$ time, and thus in $n^{o(\log n)}$ time), we can sharpen the above result by proving that deciding
whether monotone CNFs $\varphi$ and $\psi$ are non-dual is feasible in polynomial time with $O(\chi(n) \cdot \log n)$
nondeterministic steps; consequently, the problem Dual is in co-$\beta_2$P.

Like algorithm A, algorithm B uses a recursive self-reduction method that decomposes its
input, a pair $(\varphi, \psi)$ of monotone CNFs, into smaller input instances for recursive calls. Analogously,
the algorithm is thus best described via its recursion tree $T$, whose root represents the input instance
$(\varphi^*, \psi^*)$ (of size $n$), whose intermediate nodes represent smaller instances, and whose leaves represent
those instances that can be solved in polynomial time. As for algorithm A, the nodes $u$ in $T$ are labeled
with the respective instances $I(u) = (\varphi, \psi)$ of monotone pairs. Whenever there is a branching from a
node $u$ to children, then $I(u)$ is a pair of dual monotone CNFs iff $I(u')$ for each child $u'$ of $u$ in $T$ is
a pair of dual monotone CNFs. Therefore, the original input $(\varphi^*, \psi^*)$ is a dual monotone pair iff all
leaves of $T$ are labeled with dual monotone pairs.

Rather than describing algorithm B in full detail, we confine ourselves here to recalling those features which are
relevant for our analysis. In particular, we will describe some essential features of its recursion tree $T$.

For each variable $x_i$ occurring in $\varphi$, the frequency $\varepsilon_i^{\varphi}$ of $x_i$ w.r.t. $\varphi$ is defined as $\varepsilon_i^{\varphi} = |\{c \in \varphi \mid x_i \in c\}| / |\varphi|$,
i.e., as the number of clauses of $\varphi$ containing $x_i$ divided by the total number of clauses in $\varphi$. Moreover,
for each $v \ge 1$, let $\chi(v)$ be defined by $\chi(v)^{\chi(v)} = v$.

Let $v^* = |\varphi^*| \cdot |\psi^*|$ denote the volume of the input (= root) instance $(\varphi^*, \psi^*)$. For the rest of this
section, we assume that $|\varphi^*| + |\psi^*| \le |\varphi^*| \cdot |\psi^*|$. In fact, in any instance which violates this inequality,
either $\varphi^*$ or $\psi^*$ has at most one clause; in this case, Dual is trivially solvable in polynomial time.

Algorithm B first constructs the root $r$ of $T$ and then recursively expands the nodes of $T$. For each
node $u$ with label $I(u) = (\varphi, \psi)$, algorithm B does the following.
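Both quantities just defined are easy to compute. The following sketch is illustrative only: the function names `frequency` and `chi` are hypothetical, clauses are modeled as Python sets, and $\chi(v)$ is approximated by bisection on the equivalent equation $x \ln x = \ln v$, which is monotone for $x \ge 1$.

```python
import math


def frequency(cnf, x):
    """Fraction of the clauses of `cnf` that contain variable x."""
    return sum(1 for clause in cnf if x in clause) / len(cnf)


def chi(v, iterations=100):
    """Approximate the solution chi >= 1 of chi**chi == v, for v >= 1."""
    if v < 1:
        raise ValueError("chi(v) is only defined here for v >= 1")
    target = math.log(v)
    lo, hi = 1.0, 2.0
    while hi * math.log(hi) < target:   # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):         # plain bisection on x * ln(x) = ln(v)
        mid = (lo + hi) / 2.0
        if mid * math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For instance, $\chi(4) = 2$, $\chi(27) = 3$ and $\chi(256) = 4$, since $2^2 = 4$, $3^3 = 27$ and $4^4 = 256$.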

The algorithm first performs a polynomial time computation, which we shall refer to as LCheck$(\varphi, \psi)$
here, as follows. LCheck$(\varphi, \psi)$ first eliminates all redundant (i.e., non-minimal) clauses from $\varphi$ and $\psi$
and then tests whether some of the following conditions is violated:

1. $V(\varphi) = V(\psi)$;

2. $\max_{c \in \varphi} |c| \le |\psi|$ and $\max_{c' \in \psi} |c'| \le |\varphi|$;

3. $\min(|\varphi|, |\psi|) \ge 2$.
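The three tests are straightforward to express in code. The sketch below follows the description above under the assumption that CNFs are sets of `frozenset` clauses; the names `minimize` and `lcheck` are hypothetical, and, as in the text, the check returns true exactly when some condition is violated.

```python
def minimize(cnf):
    """Remove redundant clauses, i.e. clauses that strictly contain another clause."""
    cnf = set(cnf)
    return {c for c in cnf if not any(d < c for d in cnf)}


def lcheck(phi, psi):
    """Return True iff at least one of conditions 1-3 is violated (the node is a leaf)."""
    phi, psi = minimize(phi), minimize(psi)
    vars_phi = set().union(*phi)
    vars_psi = set().union(*psi)
    cond1 = vars_phi == vars_psi
    cond2 = (max((len(c) for c in phi), default=0) <= len(psi)
             and max((len(c) for c in psi), default=0) <= len(phi))
    cond3 = min(len(phi), len(psi)) >= 2
    return not (cond1 and cond2 and cond3)
```

A pair such as $\varphi = (x_1 \vee x_2) \wedge (x_1 \vee x_3)$, $\psi = x_1 \wedge (x_2 \vee x_3)$ satisfies all three conditions, so it would be expanded further rather than treated as a leaf.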

If LCheck$(\varphi, \psi)$ = true, then $u$ is a leaf of $T$ (i.e., not further expanded); whether $(\varphi, \psi)$ is a dual
monotone pair is then decided by some procedure Test$(\varphi, \psi)$ in polynomial time. In case Test$(\varphi, \psi)$
returns false, the original input $(\varphi^*, \psi^*)$ is not a dual monotone pair, and algorithm B returns false.
Moreover, in this case a counterexample $w$ to the duality of $\varphi^*$ and $\psi^*$ is computable in polynomial
time from the path leading from the root $r$ of $T$ to $u$.

If LCheck$(\varphi, \psi)$ returns false, algorithm B chooses in polynomial time some appropriate variable
$x_i$ such that $\varepsilon_i^{\varphi} > 0$ and $\varepsilon_i^{\psi} > 0$, and creates two or more children of $u$ by deterministically choosing
one of three alternative decomposition rules (i), (ii), and (iii). Each rule decomposes $I(u) = (\varphi, \psi)$
into smaller instances, whose respective volumes are summarized as follows. Let, as for algorithm A,
$\varphi_0 = \{c - \{x_i\} \mid x_i \in c, c \in \varphi\}$, $\varphi_1 = \{c \mid x_i \notin c, c \in \varphi\}$, $\psi_0 = \{d - \{x_i\} \mid x_i \in d, d \in \psi\}$, and
$\psi_1 = \{d \mid x_i \notin d, d \in \psi\}$. Furthermore, define $\varepsilon(v) = 1/\chi(v)$, for any $v > 0$.

Rule (i) If $\varepsilon_i^{\varphi} \le \varepsilon(v(u))$, then $I(u)$ is decomposed into:

a) one instance $(\varphi_1, \psi_0 \wedge \psi_1)$ of volume $\le (1 - \varepsilon_i^{\varphi}) \cdot v(u)$;

b) $|\psi_0|$ instances $I_1, \ldots, I_{|\psi_0|}$ of volume $\le \varepsilon_i^{\varphi} \cdot v(u)$ each. Each such instance $I_j$ corresponds to
one clause of $\psi_0$ and can thus be identified as the $j$-th clause of $\psi_0$ with an index $j \le |\psi_0| \le n$ (recall that $n$ denotes the size of the original input).

Rule (ii) If $\varepsilon_i^{\varphi} > \varepsilon(v(u)) \ge \varepsilon_i^{\psi}$, then $I(u)$ is decomposed into:

a) one instance $(\psi_1, \varphi_0 \wedge \varphi_1)$ of volume $\le (1 - \varepsilon_i^{\psi}) \cdot v(u)$;

b) $|\varphi_0|$ instances $I_1, \ldots, I_{|\varphi_0|}$ of volume $\le \varepsilon_i^{\psi} \cdot v(u)$ each. Each such instance $I_j$ corresponds to
one clause of $\varphi_0$ and can be identified by an index $j \le |\varphi_0| \le v^*$.

Rule (iii) If both $\varepsilon_i^{\varphi} > \varepsilon(v(u))$ and $\varepsilon_i^{\psi} > \varepsilon(v(u))$, then $I(u)$ is decomposed into:

$c_0$) one instance of volume $\le (1 - \varepsilon_i^{\varphi}) \cdot v(u)$, and
$c_1$) one instance of volume $\le (1 - \varepsilon_i^{\psi}) \cdot v(u)$.
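The case distinction between the three rules depends only on the two frequencies of the chosen variable and on $\varepsilon(v(u)) = 1/\chi(v(u))$. A minimal sketch of this selection step is given below; it does not implement the decompositions themselves, and the names `chi` and `choose_rule` as well as the string results are hypothetical. Here $\chi$ is approximated numerically by bisection.

```python
import math


def chi(v, iterations=100):
    """Approximate chi >= 1 with chi**chi == v by bisection on x * ln(x) = ln(v)."""
    target = math.log(v)
    lo, hi = 1.0, 2.0
    while hi * math.log(hi) < target:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if mid * math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def choose_rule(eps_phi, eps_psi, volume):
    """Select rule (i), (ii) or (iii) from the frequencies of x_i and the node volume."""
    threshold = 1.0 / chi(volume)       # epsilon(v(u)) = 1 / chi(v(u))
    if eps_phi <= threshold:
        return "i"                      # x_i is infrequent in phi
    if eps_psi <= threshold:
        return "ii"                     # frequent in phi, infrequent in psi
    return "iii"                        # frequent in both
```

For a node of volume 16 the threshold is roughly 0.364, so a variable occurring in a quarter of the clauses of $\varphi$ triggers rule (i).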

Algorithm B returns true iff Test$(I(u))$ returns true for each leaf $u$ of the recursion tree. This concludes
the description of algorithm B.

For each node $u$ and child $u'$ of $u$ in $T$, we label the arc $(u, u')$ with the precise type of rule that
was used to generate $u'$ from $u$. The possible labels are thus (i.a), (i.b), (ii.a), (ii.b), (iii.$c_0$), and (iii.$c_1$).
We call (i.a) and (ii.a) a-labels, (i.b) and (ii.b) b-labels, and (iii.$c_0$) and (iii.$c_1$) c-labels. Any arc with a
b-label is in addition labeled with the index $j$ of the respective instance $I_j$ in the decomposition, which
we refer to as the j-label of the arc.

Definition 5.1 For any node $u$ of the tree $T$, let $seq(u)$ denote the sequence of all edge-labels on the
path from the root $r$ of $T$ to $u$.

Clearly, if seq(u) is known, then the entire path from r to u including all node-labels (in particular, 
the one of u) can be computed in polynomial time. Indeed, the depth of the tree is at most v*, and adding 
a child to a node of T according to algorithm B is feasible in polynomial time. 

The following lemma bounds the number of various labels which may occur in seq(u). 

Lemma 5.3 For each node $u$ in $T$, $seq(u)$ contains at most (i) $v^*$ many a-labels, (ii) $\log v^*$ many
b-labels, and (iii) $\log^2 v^*$ many c-labels.

Proof. (i) Let us consider rule (i.a) first. Given that $\varepsilon_i^{\varphi} > 0$, $x_i$ effectively occurs in some clause
of $\varphi$. Thus $|\varphi_1| < |\varphi|$. Moreover, by definition of $\psi_0$ and $\psi_1$, $|\psi_0 \wedge \psi_1| \le |\psi|$. Thus we have
$|\varphi_1| \cdot |\psi_0 \wedge \psi_1| < |\varphi| \cdot |\psi|$. It follows that whenever rule (i.a) is applied, the volume decreases (at least
by 1). The same holds for rule (ii.a) by a symmetric argument. Since no rule ever increases the volume,
there are at most $v^*$ applications of an a-rule.

(ii) Assume that rule (i.b) is applied to generate a child $t'$ of node $t$. By condition 3 of LCheck,
$v(t) \ge 4$. Therefore, $\chi(v(t)) \ge 2$ and thus $\varepsilon_i^{\varphi} \le \varepsilon(v(t)) \le 1/2$. It follows that $v(t') \le v(t)/2$. The
same holds if $t'$ results from $t$ via rule (ii.b). Because no rule ever increases the volume, any node
generated after (among others) $\log v^*$ applications of a b-rule has volume $\le 1$ and is thus a leaf in $T$.

(iii) If a c-rule is applied to generate a child $t'$ of a node $t$, then, since $\varepsilon(v(t)) \ge \varepsilon(v^*) \ge 1/\log v^*$,
the volume decreases at least by factor $(1 - 1/\log v^*)$. Thus, the volume of any node $u$ which
results from $t$ after $\log v^*$ applications of a c-rule satisfies $v(u) \le v(t) \cdot (1 - 1/\log v^*)^{\log v^*} \le v(t)/e$ by
(9); i.e., the volume has decreased by more than half. Thus, any node $u$ resulting from the root of $T$ after
$\log^2 v^*$ applications of a c-rule satisfies $v(u) \le v^* \cdot (1/2)^{\log v^*} = 1$; that is, $u$ is a leaf in $T$. □



Theorem 5.4 Deciding whether monotone CNFs $\varphi$ and $\psi$ are non-dual is feasible in polynomial time
with $O(\log^2 n)$ nondeterministic steps, where $n = |\varphi| + |\psi|$.



Proof. As in the proof of Theorem 5.2, we use a compact representation $seq^*(u)$ of $seq(u)$. However,
here the definition of $seq^*$ is somewhat more involved:

• $seq^*(u)$ contains all b-labels of $seq(u)$, which are the anchor elements of $seq^*(u)$. Every b-label
is immediately followed by its associated j-label, i.e., the label specifying which of the (many)
b-children is chosen. We call a b-label and its associated j-label a bj-block.

• At the beginning of $seq^*(u)$, as well as after each bj-block, there is an ac-block. The first ac-block
in $seq^*(u)$ represents the sequence of all a- and c-labels in $seq(u)$ preceding the first b-label in
$seq(u)$, and the $i$-th ac-block in $seq^*(u)$, $i > 1$, represents the sequence of the a- and c-labels
(uninterrupted by any other label) following the $(i-1)$-st bj-block in $seq(u)$.

Each ac-block consists of an $\alpha$-block followed by a $\gamma$-block, where

- the $\alpha$-block contains, in binary, the number of a-labels in the ac-block, and

- the $\gamma$-block contains all c-labels (single bits) in the ac-block, in the order as they appear.

For example, if $s$ = "(i.a), (ii.a), $c_0$, (ii.a), $c_1$, $c_0$, (i.a)" is a maximal ac-subsequence in $seq(u)$,
then its corresponding ac-block in $seq^*(u)$ is "100, $c_0$, $c_1$, $c_0$", where 100 (= 4) is the $\alpha$-block (stating that
there are four a-labels) and "$c_0$, $c_1$, $c_0$" is the $\gamma$-block enumerating the c-labels in $s$ in their correct order.
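The encoding of a single ac-block can be sketched in a few lines. This is an illustration only: the function name `encode_ac_block` and the label strings `"i.a"`, `"ii.a"`, `"c0"`, `"c1"` are assumptions made for the example, and the count of a-labels is written in ordinary binary, so four a-labels yield the string 100.

```python
def encode_ac_block(labels):
    """Encode a maximal run of a- and c-labels as (count of a-labels in binary, c-labels in order)."""
    a_labels = ("i.a", "ii.a")
    c_labels = ("c0", "c1")
    a_count = sum(1 for label in labels if label in a_labels)
    gamma = [label for label in labels if label in c_labels]
    if a_count + len(gamma) != len(labels):
        raise ValueError("an ac-block may only contain a-labels and c-labels")
    return format(a_count, "b"), gamma
```

Note that the relative order of a-labels and c-labels is deliberately dropped; it can be recovered when the path is replayed, because the rule applied at each node is determined by algorithm B.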

The following facts are now the key to the result. 

Fact A. Given $\varphi^*$, $\psi^*$ and a string $s$, it is possible to compute in polynomial time the path $r = u_0, u_1, \ldots, u_l = u$ from the root $r$ of $T$ to the unique node $u$ in $T$ such that $s = seq^*(u)$ and all labels $I(u_i)$,
or to tell that no such node $u$ exists (i.e., $s \neq seq^*(u)$ for every node $u$ in $T$).

This can be done by a simple procedure, which incrementally constructs $u_0$, $u_1$, etc. as follows.
Create the root node $r = u_0$, and set $I(u_0) = (\varphi^*, \psi^*)$ and $t := 0$. Generate the next node $u_{t+1}$ and
label it, while processing the main blocks (ac-blocks and bj-blocks) in $s$ in order, as follows:

ac-block: Suppose the $\alpha$-block of the current ac-block has value $n_\alpha$, and the $\gamma$-block contains labels
$\gamma_1, \ldots, \gamma_k$. Set up counters $p := 0$ and $q := 0$, and while $p < n_\alpha$ or $q < k$, do the following.

If LCheck$(I(u_t))$ = true, then flag an error and halt, as $s \neq seq^*(u)$ for every node $u$ in $T$.
Otherwise, determine the rule type $\tau \in \{(i), (ii), (iii)\}$ used by algorithm B to (deterministically)
decompose $I(u_t)$.

• If $\tau \in \{(i), (ii)\}$ and $p < n_\alpha$, then assign $I(u_{t+1})$ the a-child of $I(u_t)$ according to algorithm B, and increment $p$ and $t$ by 1.

• If $\tau = (iii)$ and $q < k$, then increment $q$ by 1, assign $I(u_{t+1})$ the $\gamma_q$-child of $I(u_t)$ according
to algorithm B, and increment $t$ by 1.

• In all other cases (i.e., either $\tau \in \{(i), (ii)\}$ and $p \ge n_\alpha$, or $\tau = (iii)$ and $q \ge k$), flag an
error and halt, since $s \neq seq^*(u)$ for every node $u$ in $T$.

bj-block: Determine the rule type $\tau \in \{(i), (ii), (iii)\}$ used by algorithm B to (deterministically) decompose $I(u_t)$. If $\tau = (iii)$, then flag an error and halt, since $s \neq seq^*(u)$ for every node $u$ in $T$.
Otherwise, assign $I(u_{t+1})$ the $j$-th ($\tau$.b)-child of $I(u_t)$ according to rule ($\tau$.b) of algorithm B,
where $j$ is the j-label of the current bj-block.

Clearly, this procedure outputs in polynomial time the desired labeled path from $r$ to $u$, or flags an
error if $s \neq seq^*(u)$ for every node $u$ in $T$.

Let us now bound the size of $seq^*(u)$ in terms of the original input size $v^*$.

Fact B. For any $u$ in $T$, the size of $seq^*(u)$ is $O(\log^2 v^*)$.

By Lemma 5.3(ii), there are $\le \log v^*$ bj-blocks. As already noted, each bj-block has size $O(\log v^*)$;
thus, the total size of all bj-blocks is $O(\log^2 v^*)$. Next, there are at most $\log v^*$ many ac-blocks and thus
$\alpha$-blocks. Each $\alpha$-block encodes a number of $\le v^*$ a-rule applications (see Lemma 5.3(i)), and thus
uses at most $\log v^*$ bits. The total size of all $\alpha$-blocks is thus at most $\log^2 v^*$. Finally, by Lemma 5.3(iii),
the total size of all $\gamma$-blocks is at most $\log^2 v^*$. Overall, this means that $seq^*(u)$ has size $O(\log^2 v^*)$.

To prove that algorithm B rejects input $(\varphi^*, \psi^*)$, it is thus sufficient to guess $seq^*(u)$ for some leaf
$u$ in $T$, to compute in polynomial time the corresponding path $r = u_0, u_1, \ldots, u_l = u$, and to verify
that LCheck$(I(u))$ = true but Test$(I(u))$ = false. Therefore, non-duality of $\varphi^*$ and $\psi^*$ can be
decided in polynomial time with $O(\log^2 v^*)$ bit guesses. Given that $v^* \le n^2$, the number of guesses is
$O(\log^2 n^2) = O(\log^2 n)$. □

The following result is an immediate consequence of this theorem.

Corollary 5.5 Problem Dual is in co-$\beta_2$P and solvable in deterministic $n^{O(\log n)}$ time, where $n = |\varphi| + |\psi|$.

(Note that Yes-instances of Dual must have size polynomial in $n$, since dual monotone pairs $(\varphi, \psi)$
must satisfy conditions (2) and (3) in step 2 of algorithm A.) We remark that the proofs of Lemma 5.3
and Theorem 5.4 did not stress the fact that $\varepsilon(v) = 1/\chi(v)$; the proofs go through for $\varepsilon(v) = 1/\log v$ as
well. Thus, the use of the $\chi$-function is not essential for deriving Theorem 5.4.

However, a tighter analysis of the size of $seq^*(u)$ stressing $\chi(v)$ yields a better bound for the number
of nondeterministic steps. In fact, we show in the next result that $O(\chi(n) \cdot \log n)$ bit guesses are
sufficient. Note that $\chi(n) = o(\log n)$, thus the result is an effective improvement. Moreover, it also
shows that Dual is most likely not complete for co-$\beta_2$P.

Theorem 5.6 Deciding whether monotone CNFs $\varphi$ and $\psi$ are non-dual is feasible in polynomial time
with $O(\chi(n) \cdot \log n)$ nondeterministic steps, where $n = |\varphi| + |\psi|$.

Proof. In the proof of Theorem 5.4, our estimates of the components of $seq^*(u)$ were rather crude.
With more effort, we establish the following.

Fact C. For any $u$ in $T$, the size of $seq^*(u)$ is $O(\chi(v^*) \cdot \log(v^*))$.

Assume node $u'$ in $T$ is a child of $u$ generated via a b-rule. The j-label of the arc $(u, u')$ serves to
identify one clause of $I(u)$. Clearly, there are no more than $v(u)$ such clauses. Thus $\log v(u)$ bits suffice
to represent any j-label.

Observe that if $u$ is a node of $T$, then any path $\pi$ from $u$ to a node $w$ in $T$ contains at most $v(u)$
nodes, since the volume always decreases by at least 1 in each decomposition step. Thus, the number of
a-labeled arcs in $\pi$ is bounded by $v(u)$ and not just by $v^*$ (= $v(r)$).

For each node $u$ and descendant $w$ of $u$ in $T$, let

$$f(u, w) = \sum_{u' \in B(u, w)} \log v(u'),$$

where $B(u, w)$ is the set of all nodes $t$ on the path from $u$ to $w$ such that the arc from $t$ to its successor
on the path is b-labeled.

By what we have observed, the total size of all encodings of j-labels in $seq^*(u)$ is at most $f(r, u)$,
where $r$ is the root of $T$, and the size of all $\alpha$-blocks in $seq^*(u)$ is at most $\log(v^*) + f(r, u)$, where the first term takes care of
the first $\alpha$-block and the second of all other $\alpha$-blocks. Therefore, the total size of all $\alpha$-blocks and all
bj-blocks in $seq^*(u)$ is $O(f(r, u) + \log(v^*))$.

We now show that for each node $u$ and descendant $w$ of $u$ in $T$, it holds that

$$f(u, w) \le \log(v(u)) \cdot \chi(v(u)).$$

The proof is by induction on the number $|B(u, w)|$ of b-labeled arcs on the path $\pi$ from $u$ to $w$. If
$|B(u, w)| = 0$, then obviously $f(u, w) = 0 \le \log(v(u)) \cdot \chi(v(u))$.

Assume the claim holds for $|B(u', w)| \le i$ and consider $|B(u, w)| = i + 1$. Let $t$ be the first node
on $\pi$ contained in $B(u, w)$, and let $t'$ be its child on $\pi$. Clearly, $f(u, w) = f(t, w)$, and thus we obtain:

$f(u, w) = \log(v(t)) + f(t', w)$

$\le \log(v(t)) + \log(v(t')) \cdot \chi(v(t'))$ (induction hypothesis)

$\le \log(v(t)) + (\log(v(t)) - \log(\chi(v(t)))) \cdot \chi(v(t))$ (as $v(t') \le v(t)/\chi(v(t))$ and $\chi(v(t')) \le \chi(v(t))$)

$= \log(v(t)) \cdot \chi(v(t))$ (as $\log(\chi(y)) \cdot \chi(y) = \log y$, for all $y$).

Since $v(t) \le v(u)$, we thus have $f(u, w) \le \log(v(u)) \cdot \chi(v(u))$. This concludes the induction and proves the claim.
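For the reader's convenience, the last equality in the chain above can be expanded as follows. This is only a spelled-out version of the step, writing $v = v(t)$ and $\chi = \chi(v)$ as shorthand and using $\chi^{\chi} = v$, i.e., $\chi \log \chi = \log v$:

```latex
\begin{align*}
\log v + (\log v - \log \chi)\cdot \chi
  &= \log v + \chi \log v - \chi \log \chi \\
  &= \log v + \chi \log v - \log v \\
  &= \chi \cdot \log v .
\end{align*}
```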

Finally, we show that the total size of all $\gamma$-blocks in $seq^*(u)$, i.e., the number of all c-labels in
$seq(u)$, is bounded by $\chi(v^*) \cdot \log(v^*) \le \log^2 v^*$. Indeed, assume a c-rule is applied to generate a child
$t'$ of any node $t$, and let $v = v(t)$, $v' = v(t')$. Since $\varepsilon_i^{\varphi} > \varepsilon(v)$ and $\varepsilon_i^{\psi} > \varepsilon(v)$, we have $v' \le (1 - \varepsilon(v)) \cdot v$.
Since $\chi(v^*) \ge \chi(v)$, we have $\varepsilon(v) = 1/\chi(v) \ge 1/\chi(v^*)$ and thus

$$v' \le \Bigl(1 - \frac{1}{\chi(v^*)}\Bigr) \cdot v.$$

Hence, any node in $T$ resulting after $\chi(v^*) \cdot \log(v^*)$ applications of a c-rule has volume at most

$$v^* \cdot \Bigl(1 - \frac{1}{\chi(v^*)}\Bigr)^{\chi(v^*) \cdot \log v^*} = v^* \cdot \Bigl(\Bigl(1 - \frac{1}{\chi(v^*)}\Bigr)^{\chi(v^*)}\Bigr)^{\log v^*} \le v^* \cdot \Bigl(\frac{1}{e}\Bigr)^{\log v^*} < 1$$

(cf. also (9)). Consequently, along each branch in $T$ there must be no more than $\chi(v^*) \cdot \log v^*$ applications of a c-rule. In summary, the total sizes of all $\alpha$-blocks, all $\gamma$-blocks, and all encodings of j-labels
in $seq^*(u)$ are all bounded by $O(\chi(v^*) \cdot \log v^*)$. This proves Fact C.

As a consequence, non-duality of a monotone pair $(\varphi^*, \psi^*)$ can be recognized in polynomial time
with $O(\chi(v^*) \cdot \log v^*)$ many bit guesses. As already observed on the last lines of [17], we have $\chi(v^*) < 2\chi(n)$. Furthermore, $v^* \le n^2$, thus $\log v^* \le 2 \log n$. Hence, non-duality of $(\varphi^*, \psi^*)$ can be recognized in
polynomial time with $O(\chi(n) \cdot \log(n))$ bit guesses. □

Corollary 5.7 Problem Dual is solvable in deterministic $n^{O(\chi(n))}$ time, where $n = |\varphi| + |\psi|$.

Remark 5.1 Note that the sequence $seq(u)$ describing a path from the root of $T$ to a "failure leaf"
with label $I(u) = (\varphi', \psi')$ describes a choice of values for all variables in $V(\varphi \wedge \psi) \setminus V(\varphi' \wedge \psi')$.
By completing it with values for $V(\varphi' \wedge \psi')$ that show non-duality of $(\varphi', \psi')$, which is possible in
polynomial time, we obtain in polynomial time from $seq(u)$ a vector $w$ such that $f(w) \neq g^d(w)$. It
also follows from the proof of Theorem 5.6 that a witness $w$ for $f \neq g^d$ (if one exists) can be found in
polynomial time with $O(\chi(n) \cdot \log n)$ nondeterministic steps.

5.3 Application of Beigel and Fu's results 

While our independently developed methods substantially differ from those in M, W, membership of 
problem Dual in co-$\beta_2$P may also be obtained by exploiting Beigel and Fu's Theorem 8 in ^ (or,
equivalently, Theorem 11 in [Q]). They show how to convert certain recursive algorithms that use
disjunctive self-reductions, have runtime bounded by $f(n)$, and fulfill certain additional conditions, into
polynomial algorithms using log(/(n)) nondeterministic steps (cf. [§, Section 5]). 

Let us first introduce the main relevant definitions of [jl|]. Let \\y\\ denote the size of a problem 
instance y. 



Definition 5.2 (||Ip) A partial order -< (on problem instances) is polynomially well-founded, if there 
exists a polynomially bounded function $p$ such that

• $y_m \prec \cdots \prec y_1 \Rightarrow m \le p(\|y_1\|)$ and

• $y_m \prec \cdots \prec y_1 \Rightarrow \|y_m\| \le p(\|y_1\|)$.

For technical simplicity, ^ considers only languages (of problem instances) containing the empty
string, $\Lambda$.



Definition 5.3 (Qlp) A disjunctive self -reduction (for short, d-self-reduction) for a language L is a pair 
$(h, \prec)$ of a polynomial-time computable function $h(x) = \{x_1, \ldots, x_m\}$ and a polynomially well-founded partial order $\prec$ on problem instances such that

• $\Lambda$ is the only minimal element under $\prec$;

• for all $x \neq \Lambda$, $x \in L \Leftrightarrow h(x) \cap L \neq \emptyset$;

• for all $x$, $x_i \in h(x) \Rightarrow x_i \prec x$.



Definition 5.4 (Qlp) Let (h, -<) be a d-self-reduction and let x be a problem instance. 

• $T_{h,\prec}(x)$ is the unordered rooted tree that satisfies the following rules: (1) the root is $x$; (2) for
each $y$, the set of children of $y$ is $h(y)$.

• $|T_{h,\prec}(x)|$ is the number of leaves in $T_{h,\prec}(x)$.

Definition 5.5 ( [jl)) Let $T$ be a polynomial-time computable function. A language $L$ is in REC($T(x)$),
if there is a d-self-reduction $(h, \prec)$ for $L$ such that for all $x$

1. $|T_{h,\prec}(x)| \le T(x)$, and

2. $T(x) \ge \sum_{x_i \in h(x)} T(x_i)$.

Let T(x)-P denote the set of all (languages of) problems whose Yes-instances x are recognizable in 
polynomial time with T(x) nondeterministic bit guesses. 

Theorem 5.8 ( [§) REC(T(x)) C [logT(x)]-P 



We now show that Theorem 5.8, together with Fredman's and Khachiyan's proof of the deterministic
complexity of algorithm B, can be used to prove that problem Dual is in co-$\beta_2$P.

Let L denote the set of all non-dual monotone pairs (ip, ip) plus A. Let us identify each monotone 
pair (ip, ip) which satisfies LCheck((^, ip) but does not satisfy Test((^, ip) with the "bottom element" 
A. Thus, if a node in the recursion tree T has a child labeled with such a pair, then the label is simply 
replaced by A. 

Let us define the order $\prec$ on monotone pairs plus $\Lambda$ as follows: $J \prec I$, if $I \ne J$ and either $J = \Lambda$
or $J$ labels a node of the recursion tree generated by algorithm B on input $I$. It is easy to see that both
conditions of Definition 5.2 apply; therefore, $\prec$ is polynomially well-founded. In fact, we may define
the polynomial $p$ by the identity function; since the sizes of the instances in the recursion tree strictly
decrease on each path in $T$, the two conditions hold.
Define $h$ as the function which associates with each monotone pair $I = (\varphi, \psi)$ those instances
that label the children of the root in the recursion tree generated by algorithm B on input $I$. Clearly $h$
satisfies all three conditions of Definition 5.3, and hence $(h, \prec)$ is a d-self-reduction for $L$.

Let $T$ be the function which to each instance $I$ associates $v(I)^{\log v(I)}$ (recall that $v(I)$ denotes the
volume of $I$). It is now sufficient to check that conditions 1 and 2 of Definition 5.5 are satisfied, so
that Theorem 5.8 can be applied.



That item 1 of Definition 5.5 is satisfied follows immediately from Lemma 5 in [17], which states
that the maximum number of recursive calls of algorithm B on any input $I$ of volume $v$ is bounded by
$v^{\chi(v)}$ ($\le v^{\log v}$). Note, however, that the proof of this lemma is noticeably more involved than our
proof of the membership of Dual in co-$\beta_2$P.

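To verify item 2 of Definition 5.5, it is sufficient to prove that for a volume $v > 4$ of any input
instance to algorithm B, it holds that

$$v^{\log v} \;\ge\; (v-1)^{\log(v-1)} + \frac{v}{3}\cdot\Big(\frac{v}{2}\Big)^{\log(v/2)} \qquad (10)$$

and

$$v^{\log v} \;\ge\; 2(\alpha \cdot v)^{\log(\alpha \cdot v)}, \quad \text{where } \alpha = 1 - 1/\log v; \qquad (11)$$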


here, (10) arises from the rules (i), (ii) and (11) from rule (iii). As for (10), the child of $u$ from (i.a)
resp. (ii.a) has volume at most $v - 1$, and there are at most $v/3$ many children from (i.b) resp. (ii.b),
since $\min(|\varphi|, |\psi|) > 2$ (recall that $v = |\varphi| \cdot |\psi|$); furthermore, each such child has volume $\le \varepsilon(v) \cdot v \le v/2$.
In case of (11), the volume of each child of $u$ is bounded by $(1 - \varepsilon(v)) \cdot v \le (1 - 1/\log v) \cdot v$; note
also that $v^{\log v}$ monotonically increases for $v > 4$. To see (10), we have

$$(v-1)^{\log(v-1)} + \frac{v}{3}\cdot\Big(\frac{v}{2}\Big)^{\log(v/2)} \;\le\; (v - \;\ldots$$
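To show (11), note that

$$\begin{aligned}
2(\alpha \cdot v)^{\log(\alpha \cdot v)} &= 2\,\alpha^{\log v + \log \alpha} \cdot v^{\log v + \log \alpha} \\
&\le 2\Big(\frac{1}{e} \cdot \alpha^{\log \alpha}\Big) \cdot v^{\log v + \log \alpha} && (\alpha^{\log v} \le 1/e, \text{ since } (1 - 1/x)^x \le 1/e) \\
&= \frac{2}{e} \cdot (\alpha \cdot v)^{\log \alpha} \cdot v^{\log v} \\
&\le \frac{2}{e} \cdot v^{\log v} && ((\alpha \cdot v)^{\log \alpha} \le 1, \text{ i.e., } \log \alpha \cdot (\log \alpha + \log v) \le 0, \\
& && \ \text{since } -1 \le \log \alpha \le 0 \text{ and } \log v \ge 2) \\
&\le v^{\log v}.
\end{aligned}$$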
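As a numerical sanity check (not a proof), the two inequalities can be evaluated for a range of integer volumes. The sketch below assumes the reading of (10) as $v^{\log v} \ge (v-1)^{\log(v-1)} + (v/3)(v/2)^{\log(v/2)}$ and of (11) as $v^{\log v} \ge 2(\alpha v)^{\log(\alpha v)}$ with $\alpha = 1 - 1/\log v$, logarithms to base 2; the function names are hypothetical.

```python
from math import log2


def T(v):
    """T(v) = v ** log2(v), the bound associated with volume v."""
    return v ** log2(v)


def rhs10(v):
    """Right-hand side of (10): one child of volume v - 1, v/3 of volume v/2."""
    return T(v - 1) + (v / 3) * T(v / 2)


def rhs11(v):
    """Right-hand side of (11): two children of volume alpha * v."""
    alpha = 1 - 1 / log2(v)
    return 2 * T(alpha * v)


def holds(v):
    """True if both (10) and (11) hold at volume v (floating point)."""
    return T(v) >= rhs10(v) and T(v) >= rhs11(v)
```

For the second inequality the ratio `rhs11(v) / T(v)` stays below 2/e, in line with the chain of estimates above.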



We can thus apply Theorem 5.8 and conclude that the complement of Dual is in $\lceil \log T(x) \rceil$-P, and
thus also in $\beta_2$P.
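Written out, the guess budget for this choice of $T$ is as follows. The step to input size $n$ assumes that $|\varphi|$ and $|\psi|$ count clauses, so that the volume $v(I) = |\varphi| \cdot |\psi|$ is at most $n^2$; this assumption is ours and is stated only to make the arithmetic explicit.

$$\lceil \log T(I) \rceil = \big\lceil \log\big(v(I)^{\log v(I)}\big) \big\rceil = \big\lceil (\log v(I))^2 \big\rceil \le \big\lceil (2 \log n)^2 \big\rceil = \lceil 4 \log^2 n \rceil,$$

which is $O(\log^2 n)$ guessed bits.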

The advantage of Beigel and Fu's method is its very abstract formulation. The method has two
disadvantages, however, that are related to the two items of Definition 5.5.
The first item requires that $T(x)$ is at least the number of leaves in the tree for $x$. In order to show
this, one must basically prove a deterministic time bound for the considered algorithm (or at least a
bound on the number of recursive calls for each instance, which is often tantamount to a time bound).
The method does not suggest how to do this, but presupposes that such a bound exists (in the present
case, this was done by Fredman and Khachiyan in a nontrivial proof). The second item requires proving
that the $T$-value of any node $x$ in the recursion tree is at least the sum of the $T$-values of its children.
This may be hard to show in many cases, and does not necessarily hold for every upper bound $T$.

Our method instead does not require an a priori time bound, but directly constructs a nondetermin- 
istic algorithm from the original deterministic algorithm, which lends itself to a simple analysis that 
directly leads to the desired nondeterministic time bound. The deterministic time bound follows as an 
immediate corollary. It turns out (as exemplified by the very simple proof of Theorem ^4) that the 
analysis involved in our method can be simpler than an analysis according to previous techniques. 



6 Conclusion 

We have presented several new cases of the monotone dualization problem which are solvable in
output-polynomial time. These cases generalize some previously known output-polynomial cases.
Furthermore, we have shown by rather simple means that non-dual monotone pairs $(\varphi, \psi)$ can be recognized,
using a nondeterministic variant of Fredman and Khachiyan's algorithm B [17], in polynomial time with
$O(\log^2 n)$ many bit guesses, which places problem Dual in the class co-$\beta_2$P. In fact, a refined analysis
revealed that this is feasible in polynomial time with $O(\chi(n) \cdot \log n)$ many bit guesses.

While our results document progress on Dual and DUALIZATION and reveal novel properties of
these problems, the question whether dualization of monotone pairs $(\varphi, \psi)$ is feasible in polynomial
time remains open. It would be interesting to see whether the number of guessed bits can be further
significantly decreased, e.g., to $O(\log\log v \cdot \log v)$ many bits.